regex - Removing weird double quotes (from Excel file) in Python string


I am loading an Excel file using xlrd in Python 3. The data are basically lines of text in a spreadsheet, and some of these lines contain quotes. For example, a line can be:

She said, "My name is Jennifer."

When I read them into Python and turn them into strings, the double quotes come through as a strange double quote character that looks like a double quote in italics. I assume that somewhere along the way, Python read the character as some foreign character rather than an actual double quotation mark, because of an encoding issue or for some other reason. So in the above example, if I assign that line to "text", we would have something like the following (although in reality I do not actually type the line out, so imagine that "text" was already assigned beforehand):

  text = 'She said, “My name is Jennifer.”'
  text[10] == '"'  # evaluates to False

The second line evaluates to False because Python does not seem to recognize the character as a normal double quote character. I am working within the Mac terminal, if that makes a difference.

My questions are: 1. Is there any way to strip these funny double quotation marks easily? 2. Is there any way, when I read in the file, to get Python to recognize them as proper double quotes?

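I assume that somewhere along the way, Python read the character as some foreign character

Yes; it read it that way because that is what the file data actually represents.

rather than an actual double quotation mark, because of an encoding issue or for some other reason.

There is no problem with the encoding. The actual character is simply not a "real double quote".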

Is there any way to strip these strange double quotes easily?

You can use the .replace method of strings as you normally would, either to replace them with a "real double quote" or with nothing.
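The replacement described above can be sketched roughly as follows. The sample string is hypothetical and stands in for a value read from a spreadsheet cell; the names `QUOTE_TABLE`, `straight`, and `stripped` are illustrative only, and the sketch assumes the characters are U+201C and U+201D.

```python
# Hypothetical sample; in practice this string would come from the spreadsheet.
text = 'She said, \u201cMy name is Jennifer.\u201d'

# Option 1: chain str.replace, once per curly quote character.
straight = text.replace('\u201c', '"').replace('\u201d', '"')

# Option 2: build a translation table once and reuse it for many strings.
QUOTE_TABLE = str.maketrans({'\u201c': '"', '\u201d': '"'})
also_straight = text.translate(QUOTE_TABLE)

# To strip the quotes entirely, replace them with the empty string instead.
stripped = text.replace('\u201c', '').replace('\u201d', '')

print(straight)
print(stripped)
```

When many lines need cleaning, the translation table is probably the more convenient of the two, since further characters (for example curly single quotes) can be added to it in one place.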

Is there any way, when I read in the file, to get Python to recognize them as proper double quotes?

If you want to look for them, you can compare against the characters they really are.

As mentioned in the comment, they are most likely U+201C (LEFT DOUBLE QUOTATION MARK) and U+201D (RIGHT DOUBLE QUOTATION MARK). They are used so that the opening and closing quotes can look different (by curving in different directions), which is generally considered to make for nicer typography (as opposed to using ", which is simply more convenient for programmers). You represent them in Python with a Unicode escape:

  text[10] == '\u201c'

You could also have asked Python directly for this information, either by evaluating text[10] at the Python command line (which evaluates it and shows you its representation), or explicitly in a script with, for example, print(repr(text[10])).
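As a minimal sketch of asking Python what a character really is: the sample string below is hypothetical, and `ascii()` and `unicodedata.name()` are used in addition to the `repr()` approach mentioned above, because in Python 3 `repr()` generally shows printable non-ASCII characters as-is rather than as an escape.

```python
import unicodedata

# Hypothetical sample; in practice this string would come from the spreadsheet.
text = 'She said, \u201cMy name is Jennifer.\u201d'

ch = text[10]
print(ascii(ch))             # escaped form of the character
print(hex(ord(ch)))          # its code point as a hex number
print(unicodedata.name(ch))  # its official Unicode name

# List every non-ASCII character in the string with its position and name.
non_ascii = [
    (i, ascii(c), unicodedata.name(c, 'UNKNOWN'))
    for i, c in enumerate(text)
    if ord(c) > 127
]
for entry in non_ascii:
    print(entry)
```

Running something like this over a few suspicious lines should show which characters are present before deciding what to replace.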
