For my upcoming project, I am supposed to take a text file that has been scrambled and unscramble it into a specific format.
Each line in the scrambled file contains a line of the original text, its line number, and a three-letter code that identifies the work. These items are separated by the | character. For example,
it ran away when it saw mine coming!"|164|ALC
cried to the man who trundled the barrow; "bring up alongside and help|27|TRI
"Of course he's stuffed," replied Dorothy, who was still angry.|46|WOO
My task is to write a program that reads each line in the scrambled file, splits it into its three parts, puts the lines back in order, and collects some basic summary data. For each work, I have to determine
- its longest line (and the corresponding line number),
- its shortest line (and corresponding line number), and
- the average length of the lines in the entire work.
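For reference, here is a minimal sketch of the parsing and summary steps described above. It assumes the text itself may contain a | character (hence splitting from the right), that line length means the character count of the text field, and that the average is rounded to the nearest integer; the assignment may define these differently. The function names are only illustrative.

```python
from collections import defaultdict

def parse_line(raw):
    """Split one scrambled line into (text, line_number, code)."""
    # Split from the right so a '|' inside the text cannot break the parse.
    text, number, code = raw.rstrip("\n").rsplit("|", 2)
    return text, int(number), code

def group_by_work(raw_lines):
    """Collect (line_number, text) pairs under each three-letter code."""
    works = defaultdict(list)
    for raw in raw_lines:
        text, number, code = parse_line(raw)
        works[code].append((number, text))
    return works

def summarize(lines):
    """Return (longest, shortest, average) for one work.

    longest and shortest are (line_number, text) pairs.
    """
    longest = max(lines, key=lambda pair: len(pair[1]))
    shortest = min(lines, key=lambda pair: len(pair[1]))
    # Assumption: the average is rounded to the nearest integer.
    average = round(sum(len(text) for _, text in lines) / len(lines))
    return longest, shortest, average

# Made-up sample lines in the same text|number|code layout.
scrambled = [
    "a much longer line of text|1|ALC\n",
    "a short line|2|ALC\n",
    "go.|7|WOO\n",
]
works = group_by_work(scrambled)
print(summarize(works["ALC"]))
```

Keeping the line number next to the text in each pair means the same list can later be sorted by number to rebuild the work. A blank line in the input would make `parse_line` raise an error, so real input may need a check for that.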
The summaries should be sorted by three-letter code and should be formatted as follows:
ALC
Longest Line (107): "No, I didn’t," said Alice: "I don’t think it’s at all a pity. I said
Shortest Line (148): to."
Average Length: 59
WOO
Longest Line (66): of my way. Whenever I’ve met a man I’ve been awfully scared; but I just
Shortest Line (71): go."
Average Length: 58
Then I have to write a second file that contains the three-letter code for each work followed by its text. All of the lines must be included, in order, without line numbers or three-letter codes. The works should be separated by a line of five dashes. The result should look like the following:
ALC
A large rose-tree stood near the entrance of the garden: the roses growing on it were white, but there were three gardeners at it, busily painting them red. Alice thought this a very curious thing, and she went nearer to watch them, and just as she came up to them she heard one of them say, "Look out now, Five! Don’t go splashing paint over me like that!" "I couldn’t help it," said Five, in a sulky tone; "Seven jogged my elbow." On which Seven looked up and said, "That’s right, Five! Always lay the blame on others!"
-----
TRI
SQUIRE TRELAWNEY, Dr. Livesey, and the rest of these gentlemen having asked me to write down the whole particulars about Treasure Island, from the beginning to the end, keeping nothing back but the bearings of the island, and that only because there is still treasure not yet lifted, I take up my pen in the year of grace 17__ and go back to the time when my father kept the Admiral Benbow inn and the brown old seaman with the sabre cut first took up his lodging under our roof. I remember him as if it were yesterday, as he came plodding to the inn door, his sea-chest following behind him in a hand-barrow--a tall, strong, heavy, nut-brown man, his tarry pigtail falling over the
-----
WOO
All this time Dorothy and her companions had been walking through the thick woods. The road was still paved with yellow brick, but these were much covered by dried branches and dead leaves from the trees, and the walking was not at all good. There were few birds in this part of the forest, for birds love the open country where there is plenty of sunshine. But now and then there came a deep growl from some wild animal hidden among the trees. These sounds made the little girl’s heart beat fast, for she did not know what made them; but Toto knew, and he walked close to Dorothy’s side, and did not even bark in return.
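A rough sketch of how the unscrambled file could be assembled, assuming the lines have already been grouped into a dictionary that maps each three-letter code to a list of (line_number, text) pairs. The name `build_unscrambled` is hypothetical.

```python
def build_unscrambled(works):
    """Return the contents of the unscrambled file as one string.

    `works` maps a three-letter code to a list of (line_number, text) pairs.
    """
    sections = []
    for code in sorted(works):  # works in alphabetical order of their codes
        ordered = sorted(works[code], key=lambda pair: pair[0])  # by line number
        body = "\n".join(text for _, text in ordered)
        sections.append(f"{code}\n{body}")
    # A line of five dashes goes between works, not after the last one.
    return "\n-----\n".join(sections) + "\n"

# Made-up sample data.
works = {
    "WOO": [(2, "second line"), (1, "first line")],
    "ALC": [(1, "only line")],
}
print(build_unscrambled(works), end="")
```

The returned string can then be written out with an ordinary `open(..., "w")` and `write` call; building it in memory first keeps the separator logic in one place.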
My question is: which of Python's tools, list methods, or other data structures would be best suited to this project, where I have to move lines of text around and put them back in their original order? I would greatly appreciate some advice or help with the code.
@Gesucther
The code you posted works, except that when the program collects the data for the summaries file it only brings back:
TTL
Longest Line (59): subscribe to our email newsletter to hear about new eBooks.
Shortest Line (59): subscribe to our email newsletter to hear about new eBooks.
Average Length: 0
WOO
Longest Line (59): subscribe to our email newsletter to hear about new eBooks.
Shortest Line (59): subscribe to our email newsletter to hear about new eBooks.
Average Length: 0
ALG
Longest Line (59): subscribe to our email newsletter to hear about new eBooks.
Shortest Line (59): subscribe to our email newsletter to hear about new eBooks.
Average Length: 0
Can you see anything that would cause the average and the shortest line, or even the longest line, to come out wrong?
Here is a download link to the starting text file. https://drive.google.com/file/d/1Dwnk0ziqovEEuaC7r7YzZdkI5_bh7wvG/view?usp=sharing
EDIT*****
It is working properly now, but is there a way to change the code so that it outputs the line number where the longest and shortest lines are found, instead of the character count?
TTL
Longest Line (82): *** END OF THE PROJECT GUTENBERG EBOOK TWENTY THOUSAND LEAGUES UNDER THE SEAS ***
Shortest Line (1): N
Average Length: 58
WOO
Longest Line (74): Section 5. General Information About Project Gutenberg-tm electronic works
Shortest Line (3): it.
Average Length: 58
ALG
Longest Line (76): 2. Alice through Q.’s 3d (_by railway_) to 4th (_Tweedledum and Tweedledee_)
Shortest Line (1): 1
Average Length: 54
In the output above, the number next to "Longest Line" is (76) because that is the character length of the line. Is there a way to show the line number there instead?
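To make the request concrete, here is a sketch of what the output could look like if each line is stored as a (line_number, text) pair, which is an assumption about how the data is held. The length is then used only to pick the line, and the stored number is what gets printed. `format_summary` is an illustrative name, not a function from the posted code.

```python
def format_summary(code, lines):
    """Format one work's summary, printing line numbers rather than lengths.

    `lines` is a list of (line_number, text) pairs.
    """
    # len() only decides which pair wins; the pair keeps its line number.
    longest = max(lines, key=lambda pair: len(pair[1]))
    shortest = min(lines, key=lambda pair: len(pair[1]))
    # Assumption: the average is rounded to the nearest integer.
    average = round(sum(len(text) for _, text in lines) / len(lines))
    return (
        f"{code}\n"
        f"Longest Line ({longest[0]}): {longest[1]}\n"
        f"Shortest Line ({shortest[0]}): {shortest[1]}\n"
        f"Average Length: {average}"
    )

# Made-up sample data.
lines = [(148, "to."), (107, "a longer line")]
print(format_summary("ALC", lines))
```

If the number in parentheses currently comes from something like `len(line)`, swapping it for the stored line number in the format string is the whole change.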
EDIT****
It looks like my summary and unscrambled files are not coming out in alphabetical order. Is there a way to make them come out alphabetically by three-letter code instead?
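One likely cause, assuming the works are kept in a dictionary keyed by three-letter code: a Python dict iterates in insertion order, so the works come out in whatever order their codes first appear in the scrambled file. A small sketch of looping over the sorted keys instead (the dictionary contents are made up):

```python
# Codes inserted in the order they happened to appear in the scrambled file.
works = {"TTL": ["..."], "WOO": ["..."], "ALG": ["..."]}

def codes_in_order(works):
    """Return the three-letter codes sorted alphabetically."""
    return sorted(works)

for code in codes_in_order(works):
    print(code)
```

Using the same sorted loop when writing both the summaries file and the unscrambled file should keep the two outputs in matching order.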