
I'm new here and fairly new to programming as well, so please bear that in mind.

I am building a large database and need images to go with the data I already have in it. I have an SQL file that looks roughly like this:

CREATE TABLE `processor` (
  `id` int(11) NOT NULL,
  `name` text NOT NULL,
  `Product Collection` text
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `processor` (`id`, `name`, `Product Collection`) VALUES
(361, 'Intel Pentium D Processor 830 (2M Cache, 3.00 GHz, 800 MHz FSB)', 'Legacy Intel Pentium Processor'),
(362, 'Intel Pentium D Processor 840 (2M Cache, 3.20 GHz, 800 MHz FSB)', 'Legacy Intel Pentium Processor'),
(363, 'Intel Pentium D Processor 915 (4M Cache, 2.80 GHz, 800 MHz FSB)', 'Legacy Intel Pentium Processor'),

Now I need an image for every single row in my database. After some searching, I started using BeautifulSoup to run a Google search and download an image for the result. However, the tutorial I followed didn't read its search terms from a separate file, and as I said, I'm still a newbie with Python and BeautifulSoup. So I was wondering whether I could use my SQL file and tell BeautifulSoup to take the name from every row and use it as the keyword for a Google Images search. Maybe I could use the id in a for-loop?

for i in range(1, 2642):
    row_id = i  # renamed from id to avoid shadowing the built-in id()
    keyword = None  # TODO: get the keyword (processor name) that belongs to row_id
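To make the idea concrete, here is a rough sketch of pulling the id and name out of the dump text with a regular expression, instead of guessing ids in a range. The names `SQL_DUMP`, `ROW_PATTERN` and `extract_names` are made up for illustration, and the sketch assumes every value row starts on its own line and that no name contains an apostrophe.

```python
import re

# Sample text in the same shape as the dump above (assumed format).
SQL_DUMP = """
INSERT INTO `processor` (`id`, `name`, `Product Collection`) VALUES
(361, 'Intel Pentium D Processor 830 (2M Cache, 3.00 GHz, 800 MHz FSB)', 'Legacy Intel Pentium Processor'),
(362, 'Intel Pentium D Processor 840 (2M Cache, 3.20 GHz, 800 MHz FSB)', 'Legacy Intel Pentium Processor'),
"""

# Matches the start of a value row: (id, 'name'
# Assumption: names contain no apostrophes, so [^']* is enough.
ROW_PATTERN = re.compile(r"^\((\d+),\s*'([^']*)'", re.MULTILINE)


def extract_names(sql_text):
    """Return a dict mapping each row id to its processor name."""
    return {int(m.group(1)): m.group(2) for m in ROW_PATTERN.finditer(sql_text)}


names = extract_names(SQL_DUMP)
for row_id, keyword in names.items():
    # keyword is the string that would be handed to the image search
    print(row_id, keyword)
```

Looping over the ids that actually exist avoids having to handle gaps in a hard-coded range such as 1 to 2641. For a real file, the text would be read with `open(...).read()` and passed to `extract_names`.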

I know that some of the images might not match the processor name, but that's not a big deal, since this isn't for professional use. It's for a school project, so it won't matter much if my scraper downloads a few wrong images.

Thanks in advance!

EDIT:

I tried this (only the relevant part of the code):

def run(query, save_directory, num_images=100):
    query = '+'.join(query.split())
    logger.info("Extracting image links")
    images = extract_images(query, num_images)
    logger.info("Downloading images")
    download_images_to_dir(images, save_directory, num_images)
    logger.info("Finished")

def main():
    parser = argparse.ArgumentParser(description='Scrape Google images')
    parser.add_argument('-s', '--search', default= myresult, type=str, help='search term')
    parser.add_argument('-n', '--num_images', default=1, type=int, help='num images to save')
    parser.add_argument('-d', '--directory', default=r'C:\xampp\htdocs\dashboard\IT\GIP\other\ImageDownloader-master\image', type=str, help='save directory')
    args = parser.parse_args()
    run(args.search, args.directory, args.num_images)

if __name__ == '__main__':
    for i in range(1, 2642):
        id = i
        habe = mysql.connector.connect(
        host="localhost",
        user="root",
        passwd="",
        database="habe"
        )
        mycursor = habe.cursor()
        mycursor.execute("SELECT name FROM processor WHERE id=1")
        myresult = mycursor.fetchone()
        i += 1
        main()

But now I'm getting AttributeError: 'tuple' object has no attribute 'split' on the second line of the snippet above: query = '+'.join(query.split())
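The error seems to come from `fetchone()` returning a whole row as a tuple such as `('Intel Pentium D ...',)` rather than a plain string, so `.split()` is being called on the tuple. Below is a minimal sketch of the lookup with the first column unpacked. It uses the standard-library `sqlite3` module as a stand-in for `mysql.connector` so that it runs without a server; that swap, the in-memory table and the helper name `build_query` are assumptions for illustration. With `mysql.connector` the placeholder would be `%s` instead of `?`.

```python
import sqlite3

# Stand-in database: sqlite3 replaces mysql.connector for this sketch only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE processor (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.executemany(
    "INSERT INTO processor (id, name) VALUES (?, ?)",
    [
        (361, "Intel Pentium D Processor 830 (2M Cache, 3.00 GHz, 800 MHz FSB)"),
        (362, "Intel Pentium D Processor 840 (2M Cache, 3.20 GHz, 800 MHz FSB)"),
    ],
)


def build_query(cursor, row_id):
    """Return the '+'-joined search string for row_id, or None if no such row."""
    # Pass the id as a parameter instead of hard-coding it in the SQL text.
    cursor.execute("SELECT name FROM processor WHERE id = ?", (row_id,))
    row = cursor.fetchone()  # a tuple like ('Intel ...',) or None
    if row is None:
        return None
    name = row[0]  # unpack the single column to get a str
    return "+".join(name.split())


# Connect once, then loop; ids without a row are skipped.
queries = [q for q in (build_query(cur, i) for i in range(360, 364)) if q is not None]
print(queries[0])
```

The sketch also opens the connection once outside the loop and uses the loop variable in the query, whereas the code above reconnects on every iteration and always selects `id=1`.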
