[LCP] input needed

David Filion david at filiontech.com
Fri Jul 20 14:09:19 UTC 2007


Arjun satish wrote:
> Try using regular expressions to group your text. There are many 
> libraries that give you regex support. If the text you are looking 
> for always occurs at a certain point in a line, just use plain 
> getline and then use string APIs like strchr and strtok to extract 
> your data.
>
> Can you give an example of the text file and the messages you are 
> looking for? Then we can probably give you more core info.
>
> Cheers,
> Arjun
>
> On 7/20/07, Zach <netrek at gmail.com> wrote:
>
>     I would like to write a parser for a game log file (ASCII text).
>     There will be many different sorts of text I will need to identify
>     and group. I guess I first need to decide on existing sentinels in
>     the data I can use. I am wondering what would be the best way to
>     find and process the data files. The final output will be a
>     colorized HTML set of files with frames, one frame corresponding to
>     each of 4 different message types. A lot of data in the files I will
>     just ignore; I basically just want to extract the messages people
>     type to one another and to their team, global or personal message
>     boards. So should I just read one line at a time or read chunks?
>     I heard that in C there are a couple of different ways I could do
>     this task. Any ideas with code snippets or pseudocode would be
>     helpful.
>
>     Zach
>

I agree, regular expressions are the way to go for this.  I'm sure 
Google will turn up a list of regex libs, though you may want to start 
with a search at SourceForge or, even better, your distro's package 
manager.  You may also want to use Perl, Python or one of their friends 
to prototype your code, or even to write the final product.  Both have 
good regex support.  I use Python to parse Apache logs near the 1 GB 
mark without a problem. (I tried it with Ruby last year, and while it 
was easy to code, the memory/CPU usage wasn't acceptable.)
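
To give a rough idea of what a Python prototype could look like, here is 
a minimal sketch.  The real log format hasn't been posted in this thread, 
so the "SENDER->TARGET text" line layout, the team names RED and BLUE, 
and the names MESSAGE_RE, classify and group_messages are all made up 
for illustration.  It reads one line at a time, skips anything that 
doesn't look like a message, and groups the rest by message board.

```python
import re
from collections import defaultdict

# ASSUMPTION: messages look like "R0->ALL hi all". This layout is
# hypothetical; adjust the pattern once the real log format is known.
MESSAGE_RE = re.compile(r"^(?P<sender>\S+)->(?P<target>\S+)\s+(?P<text>.*)$")

# ASSUMPTION: hypothetical team names used only for this sketch.
TEAM_NAMES = {"RED", "BLUE"}


def classify(target):
    """Map the recipient field to a message board name."""
    if target == "ALL":
        return "global"
    if target in TEAM_NAMES:
        return "team"
    return "personal"


def group_messages(lines):
    """Group (sender, text) pairs by board, reading one line at a time."""
    groups = defaultdict(list)
    for line in lines:
        match = MESSAGE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # not a message line, ignore it
        board = classify(match.group("target"))
        groups[board].append((match.group("sender"), match.group("text")))
    return groups


if __name__ == "__main__":
    sample = [
        "R0->ALL hi all\n",
        "R0->RED attack now\n",
        "B1->R0 nice shot\n",
        "some unrelated log noise\n",
    ]
    for board, messages in group_messages(sample).items():
        print(board, messages)
```

Since a Python file object yields one line per iteration, passing an 
open file to group_messages keeps memory use flat even for large logs.  
Each group could then be written out to its own HTML frame.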


David f.



