SyntaxError: EOL while scanning string literal in Python

Question

In the following code, I am trying to build elements that can be used to train a spaCy NER model (see the 9th line of code).

from ast import literal_eval
import re

train_data_list = []

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + 
        str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
        train_data_list.append(literal_eval(element))

But I am encountering the following error:

 SyntaxError: EOL while scanning string literal

Thanks in Advance.

Look at the text value of element at the time of the literal_eval call. Fix the code to ensure it is valid: I suspect it might be .. 'funky'.
– user2864740, Commented Nov 1, 2018 at 6:17
The text value of train_data consists of continuous text. I am encountering the problem only in a few cases (that is, only while processing certain text values).
– Ananth Reddy, Commented Nov 1, 2018 at 6:22
Exactly! Some of those values result in a string that cannot be parsed with literal_eval. Once a specific example is identified, the problem should be 'clear'. Include the specific value of element for such failing cases in the question, so that proper solutions can be suggested.
– user2864740, Commented Nov 1, 2018 at 6:25
An example where the code fails is the following text value: \ncreate asset tracking database used for gain/loss profits, facility overhead, and finance research, including\nassisting in the implementation of sap business one. email correspondence, and proposal correspondence (both the\ncreation and assessment of). contract negotiations from customer/client to third party vendors and facilities.\nbuilt solid, transparent client /vendor relationships, with high client/vendor retention. Even in cases where the text contains a ", it worked fine.
– Ananth Reddy, Commented Nov 1, 2018 at 6:28
That's not the full text of element, which would be something like ("...", {"entities": [(...,"SKILL")]}) where the ...'s are "some data". (I was wrong on the " bit - that would be a different error if manifested ^_^.)
– user2864740, Commented Nov 1, 2018 at 6:29

Vineeth Sai · Accepted Answer · 2018-11-01 06:26:21Z

2

You cannot split a long line into multiple lines just by pressing Enter. Either put the element = assignment on a single line, like this:

element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'

or add a \ at the end of the first line to continue the statement:

element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + \
        str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
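A third option, not mentioned in the answer, is to wrap the expression in parentheses, since Python continues a statement across lines inside brackets. A minimal sketch, where text, start and end are hypothetical stand-ins for train_data.text[i], a.start() and a.end():

```python
# Hypothetical stand-ins for train_data.text[i], a.start() and a.end().
text = "experienced in python"
start, end = 15, 21

# Inside parentheses the expression may span several lines without a backslash.
element = (
    '("' + text + '"'
    + ', {"entities": [(' + str(start) + ',' + str(end) + ',"SKILL")]})'
)
print(element)
```

This only addresses how the source line is split; it does not change what literal_eval later receives.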

edited Nov 1, 2018 at 6:26

answered Nov 1, 2018 at 6:14

Vineeth Sai


4 Comments

Ananth Reddy Over a year ago

The code I am writing in my notebook is actually on one line.

Vineeth Sai Over a year ago

Are you copying your code somewhere from the notebook to your local IDLE?

Ananth Reddy Over a year ago

No, I am running in the notebook itself.

Vineeth Sai Over a year ago

I have no problem pasting that line into my Jupyter notebook. Double-check the line that you have in your notebook.

user2864740 · Accepted Answer · 2018-11-01 06:50:07Z

One (or more) of the element strings supplied to literal_eval cannot be parsed by literal_eval. That is, the program's own syntax is valid (or else the program would fail without running anything!); it is one or more of the element values supplied to literal_eval that is not valid Python!

The first step is to identify some 'invalid' values, e.g.:

from ast import literal_eval
import re

train_data_list = []

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
        try:
            data = literal_eval(element)
            train_data_list.append(data)
        except (SyntaxError, ValueError):  # the usual literal_eval failures
            print("Failed to parse element as a Python literal!")
            print(">>")
            print(repr(element))
            print("<<")

If the above "runs" (fsvo. "runs"), then the proposed hypothesis holds and the non-relevant answers can be ignored ;-)
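The hypothesis can also be checked without the original data. The sketch below assumes the failing rows contain a real newline character (the sample text in the comments suggests this, but it is not confirmed); the texts and the try_parse helper are made up for illustration. The exact error message depends on the Python version: older releases report "EOL while scanning string literal", newer ones an unterminated string literal.

```python
from ast import literal_eval

def try_parse(text):
    """Build the element string the same way as the question and try to parse it."""
    element = '("' + text + '", {"entities": [(6,11,"SKILL")]})'
    try:
        return literal_eval(element)
    except SyntaxError as exc:
        print("Failed to parse:", exc.msg)
        return None

# Made-up texts: the second one contains a real newline inside the quoted part.
ok = try_parse("built asset tracking database")
bad = try_parse("built asset tracking database\nused for finance research")
```

The first call returns a tuple; the second returns None because the newline ends the line in the middle of a double-quoted string literal.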

Anyway, the solution is to not use literal_eval at all. Instead, create an object directly:

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        # Offsets are kept as ints, matching what literal_eval produced
        # from the original string (str() would turn them into strings).
        data = (train_data.text[i],
                {"entities": [(a.start(), a.end(), "SKILL")]})
        train_data_list.append(data)
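For trying this out without the original train_data, here is a self-contained sketch. It assumes plain lists of texts and subtexts instead of the asker's data, and build_examples is a hypothetical helper name. The use of re.escape is an addition that is not in the question: it treats the subtext as plain text rather than as a regular expression, which may or may not be what is wanted.

```python
import re

def build_examples(texts, subtexts, label="SKILL"):
    """Return (text, {"entities": [(start, end, label)]}) tuples for rows with a match."""
    examples = []
    for text, subtext in zip(texts, subtexts):
        # re.escape is an assumption: it matches subtext literally, not as a regex.
        match = re.search(re.escape(subtext), text)
        if match is not None:
            examples.append((text, {"entities": [(match.start(), match.end(), label)]}))
    return examples

# Made-up rows; the first text contains a real newline and still works.
texts = ["built asset tracking database\nused for sap business one", "no match here"]
subtexts = ["sap business one", "python"]
train_data_list = build_examples(texts, subtexts)
print(train_data_list)
```

Because no Python source is generated and re-parsed, quotes and newlines in the text no longer matter.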

Now, if values of train_data.text[i] contain a \n (that is, the literal two-character escape sequence '\' followed by 'n'), additional work may be required to turn those into newline characters. But one step at a time, and no step should be backward! :D
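If the texts do turn out to hold the two-character sequence, one possible way to convert it is a plain str.replace. This is a sketch on a made-up string, and it only handles \n, not other escape sequences:

```python
# Made-up text holding a backslash followed by 'n', not a real newline.
raw = "create asset tracking database\\nused for finance research"

# Replace the two-character sequence with an actual newline character.
cleaned = raw.replace("\\n", "\n")
print(repr(raw))
print(repr(cleaned))
```

Since the cleaned text is one character shorter per replacement, the entity offsets should be computed by running re.search on the cleaned text, not on the raw one.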
