Log parsing in Python

In my previous articles, I wrote about basic file operations in Python. However, Python offers more options for processing files.

Typically, any well-developed application should have associated log files for analysis, monitoring and audit purposes, so that the application's behavior can be tracked. These log files usually keep growing: a line is appended whenever an event or trigger occurs in the application. To monitor that kind of log file, we need a way to read it interactively, as it is being written.

Nowadays many tools are available on the market for application monitoring, but their use is limited by licensing. In our case, there was a requirement to monitor a particular error pattern in a large log file that was updated whenever a customer performed a banking transaction. Although we had Splunk-like monitoring tools in place, space constraints meant the task was raised to us: monitor that transaction log file and trigger an alert if the error pattern appeared in it. To achieve this, we used Python's fileinput module, which reads the log file interactively, and raised an email alert to our monitoring team.

How can it work in interactive mode? It means the module can read a running log and immediately let us parse its contents. In the session below, no file names are given, so fileinput.input() reads from standard input.

>>> import fileinput
>>> for i in fileinput.input():
...     print("This is read in interactive mode, word:- " + i)
...
Apple   -- stdin input given
This is read in interactive mode, word:- Apple   -- the given word is displayed along with the print text
Ball    -- stdin input given
This is read in interactive mode, word:- Ball
cat     -- stdin input given
This is read in interactive mode, word:- cat
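The same loop can also be written as a small script. The following is a minimal sketch, assuming Python 3; the function name `label_lines` and the file `words.txt` are hypothetical and only stand in for the words typed in the session above.

```python
import fileinput
import os
import tempfile


def label_lines(paths):
    """Return every line of the given files, prefixed like the session above."""
    labelled = []
    # fileinput.input() chains the files into one stream of lines.
    # With an empty list it would read from sys.stdin instead.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # Each line keeps its trailing newline, so strip it before
            # printing to avoid a blank line after every word.
            labelled.append("This is read in interactive mode, word:- " + line.rstrip("\n"))
    return labelled


# Demo with a throwaway file (hypothetical content standing in for stdin).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "words.txt")
    with open(path, "w") as handle:
        handle.write("Apple\nBall\ncat\n")
    for text in label_lines([path]):
        print(text)
```

Using `with` closes the fileinput stream when the loop ends, which matters because `fileinput.input()` keeps one module-level stream active at a time.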

The example below gives more detail on how the fileinput module helps us.

>>> import glob       # glob matches file names against wildcard patterns such as *.*
>>> import fileinput  # fileinput must be imported before we can use it
>>> logs = fileinput.input(glob.glob("Python*"))  # create a fileinput object that reads all files whose names start with "Python"
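To show how glob and fileinput combine, here is a minimal sketch, assuming Python 3. The function name `count_lines_per_file` and the sample file names are hypothetical. It reads every file that matches a pattern as one stream and counts the lines that came from each file.

```python
import fileinput
import glob
import os
import tempfile


def count_lines_per_file(pattern):
    """Map each file matching `pattern` to the number of lines read from it."""
    counts = {}
    # glob.glob() returns matches in arbitrary order, so sort for a stable sequence.
    paths = sorted(glob.glob(pattern))
    if not paths:
        # An empty list would make fileinput fall back to sys.stdin.
        return counts
    with fileinput.input(files=paths) as logs:
        for _ in logs:
            # filename() names the file the current line came from.
            name = logs.filename()
            counts[name] = counts.get(name, 0) + 1
    return counts  # files with no lines never appear in the result


# Demo with throwaway files whose names start with "Python".
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("Python_a.log", "one\ntwo\n"), ("Python_b.log", "three\n")]:
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(text)
    for path, total in count_lines_per_file(os.path.join(tmp, "Python*")).items():
        print(os.path.basename(path), total)
```

The guard for an empty match list is a design choice: without it, a pattern that matches nothing would leave the script waiting on standard input.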

Let’s explore the methods and attributes available on this fileinput object.

>>> dir(logs)
['__del__', '__doc__', '__getitem__', '__init__', '__iter__', '__module__', '_backup', '_backupfilename', '_file', '_filelineno', '_filename', '_files', '_inplace', '_isstdin', '_mode', '_openhook', '_output', '_readline', '_savestdout', '_startlineno', 'close', 'filelineno', 'filename', 'fileno', 'isfirstline', 'isstdin', 'lineno', 'next', 'nextfile', 'readline']

Let us go back to the log-monitoring requirement: monitor the Tomcat web server’s catalina.out log and raise an alert if the pattern “Out of Memory” is found in the logs.

For this requirement, the fileinput module is a good option:

>>> logs = fileinput.input(glob.glob('Catalina.log.*'))  # read the Catalina logs
>>> logs.filename()     # returns None until the first line has been read
>>> logs.readline()     # read the first line of the first file
'I am the first line from catalina.out log\n'
>>> logs.filename()     # now returns the name of the file currently being read
>>> logs.filelineno()   # line number within the current file, i.e. where the error pattern was found
>>> logs.isfirstline()  # lets the monitor script act whenever it has just read the first line of a file
True                    # returns a boolean value
>>> logs.nextfile()     # skip the rest of this file and move on to the next (second-hour) file
>>> logs.next()         # read the next line (Python 2; in Python 3 use next(logs))
'This is another first line in second file read\n'
>>> logs.filename()
>>> logs.lineno()       # cumulative line number across the whole operation
2                       # this is the second line read from the fileinput object
>>> logs.filelineno()   # this is the first line of the second file, so the file line number is 1
>>> logs.isstdin()      # True only if the last line was read from sys.stdin rather than from a file
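Put together, the calls above are enough to scan a set of logs for the “Out of Memory” pattern and report where it was found. The following is a minimal sketch, assuming Python 3 and a plain, case-sensitive substring match. The function name `find_matches` and the sample log contents are hypothetical.

```python
import fileinput
import glob
import os
import tempfile

ERROR_PATTERN = "Out of Memory"  # the pattern named in the article


def find_matches(paths, pattern=ERROR_PATTERN):
    """Return (filename, line number within that file, line text) per matching line."""
    matches = []
    if not paths:
        # An empty list would make fileinput read sys.stdin, so stop early.
        return matches
    with fileinput.input(files=paths) as logs:
        for line in logs:
            if pattern in line:  # simple substring test, case-sensitive
                matches.append((logs.filename(), logs.filelineno(), line.rstrip("\n")))
    return matches


# Demo with throwaway files (made-up contents, not real Tomcat output).
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "Catalina.log.01": "server started\nOut of Memory while serving request\n",
        "Catalina.log.02": "all good\n",
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(text)
    paths = sorted(glob.glob(os.path.join(tmp, "Catalina.log.*")))
    for filename, line_number, text in find_matches(paths):
        print("%s line %d: %s" % (os.path.basename(filename), line_number, text))
```

Returning tuples rather than printing inside the loop keeps the scan separate from whatever the monitor does next with the matches.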

With these methods, pattern searching and parsing of running logs become straightforward.
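The email alert mentioned earlier could be built with the standard library as well. This is only a sketch, assuming Python 3. The addresses, the subject wording, the SMTP host and the function names `build_alert` and `send_alert` are all hypothetical, and the original monitoring script may have worked differently.

```python
import smtplib
from email.message import EmailMessage


def build_alert(matches, sender, recipient):
    """Build an alert email from (filename, line number, text) tuples."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Log alert: %d matching line(s)" % len(matches)
    # One "file:line: text" row per match.
    body = "\n".join("%s:%d: %s" % match for match in matches)
    message.set_content(body)
    return message


def send_alert(message, smtp_host="localhost"):
    """Hand the message to an SMTP server (the host is an assumption; not called below)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(message)


# Demo: build (but do not send) an alert for one made-up match.
sample = [("Catalina.log.01", 2, "Out of Memory while serving request")]
alert = build_alert(sample, "monitor@example.com", "team@example.com")
print(alert["Subject"])
print(alert.get_content())
```

Keeping the build step apart from the send step means the message can be checked without a mail server at hand.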
