Searching for a string in a File

Lets you find the first occurrence of a string within a file.

On Sunday, April 25th 2004 at 12:46 PM
By Sean Eshbaugh

I started working on a project to replace the often buggy and slow (and, in my opinion, just plain bad) Find Files/Folders function that comes with Windows (Windows key + 'F'). In Windows XP the search utility in the OS seems to be severely lacking in functionality. In the previous versions of Windows I used, I didn't find there to be too many problems.

The most important part of my project was searching for text within files, something the Find Files/Folders function claims it can do, but it never seems to return results even when I know there should be some. This is what caused me to look for a nice way to search for text inside a file in much the same way strstr() searches for a string inside a larger string. I did find a solution somewhere out there on the web, but for reasons I still can't figure out (the code was very messy) it would stop actually looking once you had searched through about 16MB worth of files in one run.

Since I could not find anything out there that would let me search very large amounts of file data, I had to make it myself. What I created is designed to be generally platform independent. Normally I do not write code this way because 99.9% of the time I develop things exclusively for Windows.

This will be done with plain old C-style file functions. I happen to like them better, for several reasons:

1. They're MUCH faster than C++ filestreams.
2. The compiled code is MUCH smaller than code using C++ filestreams.
3. They're compatible with old C code.
4. The code looks nicer (to me at least).
5. The project I was working with was using them.
6. They're fun and you should learn to use them.

I'm also going to be using malloc() and free() instead of new and delete. No real reason other than to make this code more C compliant, even though it is meant to be C++ code. And of course I'll be using C-style strings and C-style string functions. I always do this with code I plan on recycling, because if I ever put it in a DLL I can rest assured that programs written in another language will be able to use the function. A program written in VB won't be able to make use of a function inside a DLL that returns a std::string, but it can make use of a function that returns a pointer to a C-style string.
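
To make the DLL point concrete, here is a minimal sketch of the idea: std::string stays inside the implementation, and only plain C types cross the function boundary. GetGreeting() and BuildGreeting() are made-up names for illustration, and the Windows export decoration (for example __declspec(dllexport)) is left out to keep the sketch portable.

```cpp
#include <string.h>
#include <string>

// Hypothetical internal helper: free to use std::string because it
// never crosses the library boundary.
static std::string BuildGreeting(const char* lpszName)
{
    return std::string("Hello, ") + lpszName;
}

// Hypothetical exported function. extern "C" turns off C++ name mangling
// and only plain C types appear in the signature. The caller owns the
// output buffer. Returns the length of the full result (not counting the
// terminator), or 0 if an argument is invalid.
extern "C" unsigned long GetGreeting(const char* lpszName, char* lpszOut, unsigned long ulOutSize)
{
    if ((!lpszName)||(!lpszOut)||(!ulOutSize))
    {
        return 0;
    }

    const std::string strResult=BuildGreeting(lpszName);

    //copy as much as fits and always null terminate
    unsigned long ulCopy=(unsigned long)strResult.size();
    if (ulCopy>ulOutSize-1)
    {
        ulCopy=ulOutSize-1;
    }

    memcpy(lpszOut,strResult.c_str(),ulCopy);
    lpszOut[ulCopy]='\0';

    return (unsigned long)strResult.size();
}
```

Having the caller supply the buffer also avoids the question of which module has to free the memory, which is one common reason to prefer this shape over returning a freshly allocated string.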

Enough talk, here is the actual code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

unsigned long FileSearch(FILE* pFile, const char* lpszSearchString)
{
    //make sure we were passed a valid file pointer and search
    //string, if not return -1
    if ((!pFile)||(!lpszSearchString))
    {
        return -1;
    }

    unsigned long ulFileSize=0;

    //get the size of the file
    fseek(pFile,0,SEEK_END);

    ulFileSize=ftell(pFile);

    fseek(pFile,0,SEEK_SET);

    //if the file is empty return -1
    if (!ulFileSize)
    {
        return -1;
    }

    //get the length of the string we're looking for, this is
    //the size the buffer will need to be
    unsigned long ulBufferSize=strlen(lpszSearchString);

    if (ulBufferSize>ulFileSize)
    {
        return -1;
    }

    //allocate the memory for the buffer
    char* lpBuffer=(char*)malloc(ulBufferSize);

    //if malloc() returned a null pointer (which probably means
    //there is not enough memory) then return -1
    if (!lpBuffer)
    {
        return -1;
    }

    unsigned long ulCurrentPosition=0;

    //this is where the actual searching will happen, what happens
    //here is we set the file pointer to the current position, which
    //is incremented by one each pass, then we read the size of
    //the buffer into the buffer and compare it with the string
    //we're searching for, if the string is found we return the
    //position at which it is found; the comparison is <= so that
    //a match ending on the last byte of the file is also checked
    while (ulCurrentPosition<=ulFileSize-ulBufferSize)
    {
        //set the pointer to the current position
        fseek(pFile,ulCurrentPosition,SEEK_SET);

        //read ulBufferSize bytes from the file, stop searching if
        //we could not get that many
        if (fread(lpBuffer,1,ulBufferSize,pFile)!=ulBufferSize)
        {
            break;
        }

        //if the data read matches the string we're looking for
        if (!memcmp(lpBuffer,lpszSearchString,ulBufferSize))
        {
            //free the buffer
            free(lpBuffer);

            //return the position the string was found at
            return ulCurrentPosition;
        }
        
        //increment the current position by one
        ulCurrentPosition++;
    }

    //if we made it this far the string was not found in the file
    //so we free the buffer
    free(lpBuffer);

    //and return -1
    return -1;
}


Just a note: I know the return value is unsigned and that in all the error cases I return -1. Remember that -1 converted to an unsigned long is the largest value the type can hold, which is 0xFFFFFFFF when unsigned long is a 32-bit number (as it is on Windows). Since I sincerely doubt you will ever come across a single file that is over 4GB, this should never be a problem. If you do need to search a file that is over 4GB, I suggest replacing "unsigned long" with "unsigned __int64" if your compiler supports it; keep in mind that fseek() and ftell() work with a plain long, so you would also need your compiler's 64-bit versions of those. If you do need to do that, then I doubt even more that your hard drive can hold a file that is 2^64 bytes in size, so returning -1 (a REALLY big number for 64-bit numbers) will do nicely.
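
As a small sketch of how a caller might deal with that return value: rather than comparing an unsigned result against a literal -1, give the sentinel a name. FILESEARCH_NOT_FOUND, FILESEARCH_NOT_FOUND_64 and FileSearchFailed() are hypothetical names and are not part of the function above.

```cpp
#include <limits.h>

// Hypothetical name for the "not found / error" value FileSearch() returns.
// Converting -1 to an unsigned type always yields that type's largest
// value, whatever its width.
const unsigned long FILESEARCH_NOT_FOUND=(unsigned long)-1;

// The same idea for a 64-bit variant of the function (assumes the
// compiler supports unsigned long long).
const unsigned long long FILESEARCH_NOT_FOUND_64=(unsigned long long)-1;

// Hypothetical helper so callers never write "== -1" against an
// unsigned value themselves.
inline bool FileSearchFailed(unsigned long ulResult)
{
    return ulResult==FILESEARCH_NOT_FOUND;
}
```

A caller would then write something like `if (!FileSearchFailed(ulPos)) { ... }`, treating any other value as a byte offset into the file.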

The above code is probably not the most efficient way of doing this, but it works, and it works fast. If I get the time I might try to make this as fast as possible, but unless this becomes the bottleneck of a program I'm working on, that might not be for a while.
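
For anyone who does hit that bottleneck, one common way to speed this up is to read the file in large blocks instead of doing one fseek() and one fread() per byte. The following is only a sketch of that idea, not a tuned implementation. FileSearchBlocks() is a hypothetical name, the 64KB default block size is an arbitrary assumption, and the last few bytes of each block are carried over so that a match straddling two blocks is not missed.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical block-reading variant of FileSearch(). Returns the offset
// of the first match, or (unsigned long)-1 if there is none or on error.
unsigned long FileSearchBlocks(FILE* pFile, const char* lpszSearchString, size_t nBlock=64*1024)
{
    const unsigned long ulNotFound=(unsigned long)-1;

    if ((!pFile)||(!lpszSearchString)||(!nBlock))
    {
        return ulNotFound;
    }

    const size_t nNeedle=strlen(lpszSearchString);
    if (!nNeedle)
    {
        return ulNotFound;
    }

    //room for one block plus the tail carried over from the previous one
    char* lpBuffer=(char*)malloc(nBlock+nNeedle-1);
    if (!lpBuffer)
    {
        return ulNotFound;
    }

    if (fseek(pFile,0,SEEK_SET)!=0)
    {
        free(lpBuffer);
        return ulNotFound;
    }

    size_t nCarry=0;         //bytes kept from the previous block
    unsigned long ulBase=0;  //file offset of lpBuffer[0]
    size_t nRead=0;

    while ((nRead=fread(lpBuffer+nCarry,1,nBlock,pFile))>0)
    {
        const size_t nHave=nCarry+nRead;

        //try every start position that leaves room for a full match
        for (size_t i=0;i+nNeedle<=nHave;i++)
        {
            if (!memcmp(lpBuffer+i,lpszSearchString,nNeedle))
            {
                free(lpBuffer);
                return ulBase+(unsigned long)i;
            }
        }

        //keep the last nNeedle-1 bytes, none of them has been tried
        //as a start position yet
        const size_t nKeep=(nHave<nNeedle-1)?nHave:nNeedle-1;
        memmove(lpBuffer,lpBuffer+nHave-nKeep,nKeep);
        ulBase+=(unsigned long)(nHave-nKeep);
        nCarry=nKeep;
    }

    free(lpBuffer);
    return ulNotFound;
}
```

The comparison itself is still the same brute-force memcmp() at every position; the only change is how many times the file is touched.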

Current Comments
by suman on Monday, April 10th 2006 at 10:23 PM

Hi sean,
I am quite interested in your work. You have done a very good job. I would like to have the full code of this program, can u send it to my mail-id please. bsuman256@rediffmail.com is my id.
I would be very thankful to you for sharing your C code for searching a string in a file.

Thank You very much in advance

Suman Bharath.

by Krishna on Thursday, November 23rd 2006 at 06:46 AM

It is a good piece of work. Could u send me the complete copy of the C++ code u have written. My email id being kittu24@gmail.com
Regards
KS

by Muthukumar on Monday, February 5th 2007 at 02:23 AM

Thanks for ur code .If u didnt posted it i too gone for implementing ... this code is really nice thank u.

by vijay on Monday, April 30th 2007 at 10:02 AM

hi, It is really good. Could u send me the full code, to my id, satya.vijai@gmail.com.

Thanks in advance

by Beeteh on Thursday, May 17th 2007 at 06:09 AM

Hi Sean,
Thanks for sharing your code with us. It's pretty neat and I found it easy to understand.
If it's not too much trouble, could you please send me the full code to skemii@hotmail.com?....you don't have to if you don't want to though.
Happy programming! & Thanks again!

by Sami on Thursday, August 9th 2007 at 02:14 PM

Hi Sean,
This is excellent. I was looking for something like this for a while. We have an issue where I need to find a string within a file but couldn't find anything. Thanks for the great work.

I have one question and it might be obvious to all but not me :( how do I run this code if I am searching for the string "RT5004" within a file and I have over 10,000 files?

Thanks,
Sami

by vineeta on Thursday, December 27th 2007 at 08:29 AM

Hi sean

I want to work with file handling.so can u please send me full code ??

waiting for your reply.

Thanks & Regards
Vineeta

by leo on Friday, January 18th 2008 at 08:55 AM

hey man... It is too good. Can u send me the full code to my id !!! :) leoviveke@yahoo.com

Thanks in advance

by Vivek on Friday, February 1st 2008 at 04:10 AM

Your program is absolutely working fine.

But ther are cases where the results are not as required.

For ex: I need to find a string "if" in some file.
The words that contain the word if is also considered valid, which should not be.
i.e "theif" this word contains if... this is also taken into consideration.

can u suggest me a better way to avoid this fact.
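
One way to handle the whole-word case, as a minimal sketch: once a match is found, also look at the byte just before it and the byte just after it, and accept the match only if neither is a letter, digit or underscore. IsWordChar() and IsWholeWordMatch() are made-up names for illustration, and the sketch works on data already in memory; for a file you would read those two neighbouring bytes as well.

```cpp
#include <ctype.h>
#include <stddef.h>

// Hypothetical definition of what counts as part of a word
static bool IsWordChar(char c)
{
    return isalnum((unsigned char)c)||c=='_';
}

// Hypothetical check: is the match of length nMatchLen that starts at
// nPos inside lpData (nDataLen bytes long) a word of its own?
bool IsWholeWordMatch(const char* lpData, size_t nDataLen, size_t nPos, size_t nMatchLen)
{
    if ((!lpData)||(!nMatchLen)||(nPos>nDataLen)||(nMatchLen>nDataLen-nPos))
    {
        return false;
    }

    //fine on the left if the match starts the data or follows a non-word character
    const bool bLeftOk=(nPos==0)||!IsWordChar(lpData[nPos-1]);

    //fine on the right if the match ends the data or precedes a non-word character
    const size_t nEnd=nPos+nMatchLen;
    const bool bRightOk=(nEnd==nDataLen)||!IsWordChar(lpData[nEnd]);

    return bLeftOk&&bRightOk;
}
```

With the text "theif if", the "if" at offset 3 is rejected because it follows an 'e', while the one at offset 6 is accepted. When a match is rejected the search has to continue from the next position instead of returning.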

by sarma on Friday, March 7th 2008 at 04:37 AM

i need to search for a string in a pdf file using C#.Net and Asp.Net , could u help me please

by Muneeswaran on Monday, March 10th 2008 at 03:37 AM

Hi, I tried to run your code in an MFC application, but there I faced an error like: cannot convert parameter 1 from 'struct _iobuf *' to 'char *'. So please send the full source code to me.

by anand on Friday, March 14th 2008 at 12:51 AM

Hi very fantastic job done yar... Can i get a full source code...?

by pamplemoose on Wednesday, March 19th 2008 at 03:50 PM

Fantastic bit of code, helped me enormously in a project im working on.

I would say change:
while (ulCurrentPosition<ulFileSize-ulBufferSize)
to:
while (ulCurrentPosition<=ulFileSize-ulBufferSize)

without it I was missing the last character of the file, therefore if the required string was there it wasn't found.

by Merlin on Monday, April 21st 2008 at 09:48 AM

For purposes of speed here's a version that only reads each character from file once (and thus doesn't need the extra seeks). I'm using it for a true/false on whether the string occurs but the filePos variable has the right value to be returned instead of true to use this as a search for first occurrence method.

The key is to use the same buffer and keep shifting the bytes to make room for more. I use a 2n-1 buffer so that each byte is only moved in memory once.

//Use this signature with obvious changes to find position in file of first occurrence
//static unsigned long findStrInFile(FILE* pFile, const char* str)
inline bool fileContainsStr(FILE* pFile, const char* const str)
{
const unsigned long strLen = strlen(str), strLenM1 = strLen-1;
if( !str || !strLen || !pFile ) { return false; }

if( fseek(pFile, 0, SEEK_END) != 0 ) { return false; }
unsigned long fileLen = ftell(pFile); fseek(pFile, 0, SEEK_SET);

if( !fileLen || strLen > fileLen ) { return false; }

char* const searchBuf = (char*)malloc( 2*strLen - 1 ); if( !searchBuf ) { return false; }
char *pSearch = searchBuf, *pWrite = searchBuf, * const pMid = searchBuf strLenM1;

unsigned long filePos = 0;
fread(searchBuf, strLenM1, 1, pFile); pWrite = strLenM1;
while( 1 )
{
fread(pWrite, 1, 1, pFile); pWrite;

if( !memcmp( pSearch, str, strLen ) ) { free(searchBuf); return true; }

if( filePos > fileLen - strLen ) { break; }

if( pSearch > pMid ) { memcpy(searchBuf, pSearch, strLenM1); pSearch = searchBuf; pWrite = pMid; }
}
free(searchBuf); return false;
}

by Merlin on Monday, April 21st 2008 at 09:52 AM

pMid should be initialized to searchBuf PLUS strLenM1, the "+" seems to have been dropped on copy or paste somewhere. Also very important: the filePos > fileLen - strLen test should be preceded by the prefix increment operator, "++" (plus plus). Not sure why my plus signs vanished. pSearch is also pre-incremented before comparison to pMid.

by Merlin on Monday, April 21st 2008 at 09:56 AM

For cut/pasteability

<pre>
//Use this signature with obvious changes to find position in file of first occurrence
//static unsigned long findStrInFile(FILE* pFile, const char* str)
inline bool fileContainsStr(FILE* pFile, const char* const str)
{
if( !str || !pFile ) { return false; }
const unsigned long strLen = strlen(str), strLenM1 = strLen-1;
if( !strLen ) { return false; }

if( fseek(pFile, 0, SEEK_END) != 0 ) { return false; }
unsigned long fileLen = ftell(pFile); fseek(pFile, 0, SEEK_SET);

if( !fileLen || strLen > fileLen ) { return false; }

char* const searchBuf = (char*)malloc( 2*strLen - 1 ); if( !searchBuf ) { return false; }
char *pSearch = searchBuf, *pWrite = searchBuf, * const pMid = searchBuf + strLenM1;

unsigned long filePos = 0;
fread(searchBuf, strLenM1, 1, pFile); pWrite += strLenM1;
while( 1 )
{
fread(pWrite, 1, 1, pFile); ++pWrite;

if( !memcmp( pSearch, str, strLen ) ) { free(searchBuf); return true; }

if( ++filePos > fileLen - strLen ) { break; }

if( ++pSearch > pMid ) { memcpy(searchBuf, pSearch, strLenM1); pSearch = searchBuf; pWrite = pMid; }
}
free(searchBuf); return false;
}
</pre>

by naresh on Tuesday, June 3rd 2008 at 05:26 PM

I need your help sir.will you have create search programming in c . you have the send the program my mail id.

Thanking you,

by Mac on Sunday, June 8th 2008 at 03:38 PM

I want to parse a log file for text
like say string starting with ABC and ending at the first occurrence of ; after that
i want all such strings in a separate file
can you please help
thanks

by Saurabh on Monday, June 30th 2008 at 01:12 AM

Its a really good illustration of string searching.
Can u send me the full source code on saurabh_717@yahoo.co.in
so i can further work on it.

by Mohan on Wednesday, January 28th 2009 at 09:58 AM

Hi, can you send this source code my mail id : msmohanbabu@gmail.com.

Thanks

by dranaxum on Wednesday, February 4th 2009 at 05:15 AM

For string searches use the Knuth-Morris-Pratt algorithm. It's faster!

http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm
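
A minimal sketch of what that could look like for a file: because Knuth-Morris-Pratt never backs up in the text being searched, the file can be read one character at a time with fgetc() and no seeking during the search. FileSearchKMP() is a hypothetical name, and the return convention (offset of the first match, or -1 converted to unsigned long) is assumed to mirror the tutorial's function.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical KMP-based variant of FileSearch()
unsigned long FileSearchKMP(FILE* pFile, const char* lpszSearchString)
{
    const unsigned long ulNotFound=(unsigned long)-1;

    if ((!pFile)||(!lpszSearchString))
    {
        return ulNotFound;
    }

    const size_t nLen=strlen(lpszSearchString);
    if (!nLen)
    {
        return ulNotFound;
    }

    //failure table: lpFail[i] is the length of the longest proper prefix
    //of the pattern that is also a suffix of pattern[0..i]
    size_t* lpFail=(size_t*)malloc(nLen*sizeof(size_t));
    if (!lpFail)
    {
        return ulNotFound;
    }

    lpFail[0]=0;
    size_t k=0;
    for (size_t i=1;i<nLen;i++)
    {
        while ((k>0)&&(lpszSearchString[i]!=lpszSearchString[k]))
        {
            k=lpFail[k-1];
        }
        if (lpszSearchString[i]==lpszSearchString[k])
        {
            k++;
        }
        lpFail[i]=k;
    }

    if (fseek(pFile,0,SEEK_SET)!=0)
    {
        free(lpFail);
        return ulNotFound;
    }

    unsigned long ulConsumed=0;  //bytes read from the file so far
    size_t nMatched=0;           //pattern characters matched so far
    int c=0;

    while ((c=fgetc(pFile))!=EOF)
    {
        ulConsumed++;

        //on a mismatch fall back in the pattern, never in the file
        while ((nMatched>0)&&((unsigned char)lpszSearchString[nMatched]!=(unsigned char)c))
        {
            nMatched=lpFail[nMatched-1];
        }
        if ((unsigned char)lpszSearchString[nMatched]==(unsigned char)c)
        {
            nMatched++;
        }
        if (nMatched==nLen)
        {
            free(lpFail);
            return ulConsumed-(unsigned long)nLen;
        }
    }

    free(lpFail);
    return ulNotFound;
}
```

Each byte of the file is read exactly once, and the extra memory is one table entry per character of the search string.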

by lightheart on Friday, March 27th 2009 at 07:06 AM

Hi,

@Merlin interesting code, unfortunately I
have some troubles getting it compiled:
When compiling under VS 2008 I get an error in the line: fread(searchBuf, strLenM1, 1, pFile); pWrite = strLenM1;
error C2440: '=' : cannot convert from 'const unsigned long' to 'char *'

and secondly what does the pWrite does after this:
fread(pWrite, 1, 1, pFile); pWrite;

Thanks!

by PRIYADHARSHINI R on Tuesday, June 9th 2009 at 02:39 AM

Nice job Sean. Thanks.

by Kunal on Wednesday, August 5th 2009 at 02:03 PM

REally good code....

by Ronnie on Friday, October 30th 2009 at 07:18 PM

hi!! everyone, can someone help me with this problem. I have been asked to do a function to search for id's of students given a text file which should I use within. Id is string so you must return a id in string.

by nash on Friday, November 20th 2009 at 12:12 PM

This is a very good piece of code. Could you send me the full source at nasthalgic@yahoo.com

Thanks you very much. Great Work!!!

by chuborekek on Saturday, January 9th 2010 at 08:04 AM

could you send me also the full source code of it.... thank you sooooooooo much!!! bhedoca@yahoo.com ^^,

by Nilson on Friday, March 5th 2010 at 05:27 AM

This look really great, i'm doing my degree in computer science and this is my first year.would be very helpfull if i could get the full source code(nick_30266@hotmail.com)? Thaaaankssss

by Koen on Wednesday, April 14th 2010 at 12:07 PM

Hello,
Congratulations with this fine piece of code. Would you be so kind to share the code with me?
Thanks a lot in advance

by sargam on Tuesday, April 27th 2010 at 08:26 AM

Hi!!
I need to write a program that writes book data like name, publisher, author, price and stock into a file. Then search the same file for a given book name entered by user. If book is found, display its data, otherwise if out of stock then report the same.
Please help me how to find a particular book name in the file. Rest of the program is done, only problem is searching!!!!
Help!!!!!

by DrAcX on Monday, May 24th 2010 at 08:17 PM

Nice code!
But search may be ending prematurely if string to search for is at the very end of the file.

Perhaps change the condition:
ulCurrentPosition < (ulFileSize - ulBufferSize)
to:
ulCurrentPosition < (ulFileSize - ulBufferSize + 1)

by Yasi on Sunday, June 20th 2010 at 01:00 PM

I'm a physics major student. I need to write a program which searches a file and finds a couple of parameter then store them in another file. This code looks extremely helpful to me. Could you please send the full version to me.

Bests,

by Thanh on Monday, July 26th 2010 at 10:31 AM

Hi all. I must write a program by C for Windows that search file in a folder as Windows. Please help me. Thanks all.

by Patt on Tuesday, August 31st 2010 at 01:39 AM

Hey man....I have been trying to write a program which examines each text file in a selected folder for lines containing a particular string.Any such lines are to be appended to a multi-line text box, preceded by their file name. If you could send me the code of which to implement this task to my gmail...Thankyou very much.

