
I have some log files. I want to convert the content of these files to JSON format using Python. The required JSON format is:

{
"content":  {
       "text" :      // raw text to be split
},
"metadata":  {
       ...meta data fields, e.g. hostname, logpath,
       other fields passed from client...
     }
}

I tried json.dumps in Python 2.7, but I am getting unexpected errors. Any suggestions would be great. Thanks.

The error I got:

Traceback (most recent call last):
  File "LogToJson.py", line 12, in <module>
    f.write(json.dumps(json.loads(f1), indent=1))
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

Sample data:

Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> address 9.124.29.61
Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> prefix 24 (255.255.255.0)
Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> gateway 9.124.29.1
  • Please add the errors you encounter. Commented Feb 5, 2016 at 9:02
  • I already tried that answer from a post; it shows ValueError: No JSON object could be decoded. Commented Feb 5, 2016 at 9:15
  • Could you provide sample data? Commented Feb 5, 2016 at 9:24

2 Answers


Without the code you wrote, it is hard to recommend anything specific. From your comments, though, I suppose you are using json.loads() to read from the file, but it only works on Python strings that already contain JSON. To read from a file you should use json.load(), but in that case the contents of the file must already be in JSON format. So I suggest reading the log file line by line, parsing each line, giving it some structure (e.g. building a Python dict from it), then converting that to JSON and writing it to a new file. The documentation for the json module is worth checking.
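
The approach described above could be sketched roughly as follows. This is only an illustration, assuming each log entry sits on a single line and that the whole line goes into content.text. The function names line_to_record and convert, and the choice of logpath as the only metadata field, are hypothetical and not taken from the question's code.

```python
import json


def line_to_record(line, logpath):
    """Wrap one raw log line in the structure the question asks for."""
    return {
        "content": {"text": line.rstrip("\n")},
        # Hypothetical metadata: without parsing, only the path is known.
        "metadata": {"logpath": logpath},
    }


def convert(log_path, json_path):
    """Read log_path line by line and write a JSON array to json_path."""
    with open(log_path) as src:
        # Iterate over the file object instead of handing it to
        # json.loads(), which only accepts a string of JSON text.
        records = [line_to_record(line, log_path)
                   for line in src if line.strip()]
    with open(json_path, "w") as dst:
        json.dump(records, dst, indent=1)
    return records
```

Calling convert("syslog.log", "syslog.json") (both file names are placeholders) would write one record per non-empty log line. Real metadata such as the hostname would still have to be parsed out of each line.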


You need to write a parser that converts your syslog output into JSON. I suggest using re to parse each line and then using the captured values in your dict as required.

Example Code:

import json
import re

# Skeleton of the required output structure
output = {'content': {}, 'metadata': {}}

# Capture the timestamp, hostname and process tag from one syslog line
parsed_data = re.findall(r'(\w{3} \d+ [\d:]+) (\S+) (\S+):', 'Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]:')

output['metadata']['time'] = parsed_data[0][0]
output['metadata']['host'] = parsed_data[0][1]
output['metadata']['info'] = parsed_data[0][2]

print(json.dumps(output))
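
The example above leaves content empty. A possible extension, sketched here under the assumption that every line follows the layout of the sample data (timestamp, host, program with an optional [pid], then the message), uses named groups so the message lands in content.text and the rest in metadata. The pattern SYSLOG_RE, the function parse_line and the metadata key names are illustrative choices, not a fixed format.

```python
import json
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <program>[<pid>]: <message>"
SYSLOG_RE = re.compile(
    r"(?P<time>\w{3}\s+\d+ [\d:]+) "
    r"(?P<host>\S+) "
    r"(?P<process>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<text>.*)"
)


def parse_line(line):
    """Return the record for one syslog line, or None if it does not match."""
    match = SYSLOG_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The message goes under content; everything else is metadata.
    text = fields.pop("text")
    return {"content": {"text": text}, "metadata": fields}


sample = ("Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: "
          "<info> address 9.124.29.61")
record = parse_line(sample)
print(json.dumps(record, indent=1, sort_keys=True))
```

Lines that do not match return None, so the caller can decide whether to skip them or keep them as raw text. If a line has no [pid] part, the pid field is None and is serialized as null.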
