| subject: | Re: C++ |
From: "Paul Ranson"
How do you parse the data to get it into SQL Server? ISTM that this is the
crux of your particular application, and unfortunately that's nothing much
to do with arrays or files.
If you use a 'std::string' rather than your char buffer then you don't have
to worry about the length of the input line, although setting a plausible
maximum is sensible.
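
As a rough sketch of that idea (not your actual code): `std::getline` grows the string to fit each line, and the length check is only a sanity filter applied afterwards. The names `readLines` and `kMaxLineLength` and the 4096 limit are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical limit; pick whatever is plausible for your logs.
const std::size_t kMaxLineLength = 4096;

// Reads every line from 'in'. Lines longer than kMaxLineLength are
// skipped (not truncated) and counted in 'skipped'.
std::vector<std::string> readLines(std::istream& in, std::size_t& skipped)
{
    std::vector<std::string> lines;
    std::string line;
    skipped = 0;
    while (std::getline(in, line))   // loop ends when no more lines can be read
    {
        if (line.size() > kMaxLineLength)
        {
            ++skipped;
            continue;
        }
        lines.push_back(line);
    }
    return lines;
}
```

Because it takes a `std::istream&`, the same function works with a `std::ifstream` for the real log or a `std::istringstream` for testing.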
There's no point in storing the word 'SPAM' thousands of times; instead,
convert 'SPAM', 'NOTSPAM', or whatever your possibilities are, into numbers.
The same goes for the time, message numbers, IP addresses, etc.
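
Something like the following is what I have in mind, as a sketch only. The set of classifications is a guess (I only know about 'SPAM' for sure), and `parseClassification` and `parseIPv4` are names invented here, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Assumed set of classifications; extend to match the real log.
enum class Classification { Unknown, Spam, NotSpam };

Classification parseClassification(const std::string& word)
{
    if (word == "SPAM") return Classification::Spam;
    if (word == "NOTSPAM") return Classification::NotSpam;
    return Classification::Unknown;   // anything unrecognised
}

// Packs a dotted-quad address such as "66.214.143.78" into 32 bits,
// first octet in the most significant byte.
// Returns false (leaving 'out' untouched) if the text is malformed.
bool parseIPv4(const std::string& text, std::uint32_t& out)
{
    std::istringstream in(text);
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
    {
        int octet = -1;
        char dot = 0;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        if (i < 3 && (!(in >> dot) || dot != '.')) return false;
        result = (result << 8) | static_cast<std::uint32_t>(octet);
    }
    char extra;
    if (in >> extra) return false;    // trailing junk after the fourth octet
    out = result;
    return true;
}
```

Stored this way, a classification or an address is a small fixed-size value that compares and sorts quickly, which matters once you start totalling by column.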
I would use a formal parsing framework for this, especially since that
would handle bogus input cleanly.
What kind of statistics do you want to acquire? A really simple first
approach would be to store each line as a string in a vector, and then
search the vector for all strings that contain 'SPAM', or all strings that
contain 'SPAM' and a particular 'from' address. This would be achievable in
short order; not necessarily very elegant or useful, but achievable.
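
A minimal sketch of that search, assuming the lines are already in a `std::vector<std::string>`. The helper name `linesContaining` is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the lines that contain every one of the given substrings.
// An empty 'needles' list matches every line.
std::vector<std::string> linesContaining(
    const std::vector<std::string>& lines,
    const std::vector<std::string>& needles)
{
    std::vector<std::string> matches;
    for (const std::string& line : lines)
    {
        bool hasAll = true;
        for (const std::string& needle : needles)
        {
            if (line.find(needle) == std::string::npos)
            {
                hasAll = false;
                break;
            }
        }
        if (hasAll) matches.push_back(line);
    }
    return matches;
}
```

Calling it with just "SPAM" gives all the spam rows, and adding an address as a second needle narrows it to one sender. Note that this is a plain substring match, so "SPAM" would also match a line that starts with "NOTSPAM"; that is one of the reasons it isn't very elegant.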
Paul
"Geo" wrote in message
news:4194066f$1{at}w3.nls.net...
> "Paul Ranson" wrote in message
> news:4193fdf1{at}w3.nls.net...
>
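> std::ifstream is ( "MyFileName" ) ;
> std::string sbuf ;
> // std::getline reads into a std::string and returns the stream,
> // so the loop stops as soon as a read fails
> while ( std::getline ( is, sbuf ) )
> {
> rows.push_back ( LogFileRow ( sbuf )) ;
> }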
>
> But it's probably worth considering which elements you want to store and
> what a natural way to store them is, for instance there's no point in
> storing times or IP addresses as strings, convert them in LogFileRow.
>
> I'm guessing some of the stuff above looks a bit odd...
>
> Paul
>
> ==================
>
> I don't know buffers yet so here is how I was bringing in the file
>
> char filline[256];
> int ch; // peek() returns int, so EOF can be told apart from real characters
>
> while ((ch=infile.peek()) != EOF)
> {
> infile.getline(filline, 256, '\n');
> cout << filline << '\n';
> }
>
> I then need some way to go thru the rows (as I read them in) which looks
> like this:
>
> SPAM 11 Nov 2004 00:08:06.429 H 08082 57877 66.214.143.78
> sheldon{at}accel.net
> thorn{at}nls.net Content scan failure - see http://www.nls.net/bounced.htm
> , message quality is 2147483648, quality mask is 0
>
> and put them into some sort of a spreadsheet like structure so I can process
> them as stats. Typically I would bring them into a string array where rows
> were the whole line and columns were like
>
> classification
> day
> month
> year
> time
> operation
> message number
> ipaddress
> fromaddress
> toaddress
> cause
> blah blah
>
> and then I can run totals and sort by those totals for each column.
>
> you know.. now that I'm thinking about it this would be a whole lot simpler
> if I just dumped the logs into SQLserver and did this in SQL.. But then
> I wouldn't be learning file access and arrays in C++...
>
> Ok so I'm open to suggestions, how would you do this if you were a C++
> newbie?
>
> Geo.
>
>
--- BBBS/NT v4.01 Flag-5
* Origin: Barktopia BBS Site http://HarborWebs.com:8081 (1:379/45) |
|
| SOURCE: echomail via fidonet.ozzmosis.com | |