With the rise of the Internet, more and more people shop online. Taobao is a giant of the e-commerce industry, and its platform exposes a great deal of commercial data.

Today I will show you how to use Python and Selenium to collect this public data.

Suitable for:

Python beginners and students interested in web crawlers and data collection!

Environment:

Python 3.6

1. Install selenium module

pip install selenium
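
Before going further, it can help to confirm that the module is importable from the interpreter you plan to run the script with. A minimal sketch using only the standard library; the helper name `is_installed` is made up for this illustration:

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can find the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    if is_installed('selenium'):
        print('selenium is available')
    else:
        print('selenium is missing, run: pip install selenium')
```

If the check reports the module as missing even after installation, `pip` probably installed it for a different Python interpreter than the one running the script.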


2. Request web address

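from selenium import webdriver

if __name__ == '__main__':
    keyword = input('Please enter the product data you want to query:')
    driver = webdriver.Chrome()  # Launch a Chrome browser session
    driver.get('https://www.taobao.com/')  # Assumed entry page: the Taobao home page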




3. Log in to Taobao account and search for products

import re
from selenium.webdriver.common.by import By


def search_product(key):
    """Search for a product and return the total number of result pages."""
    driver.find_element(By.ID, 'q').send_keys(key)  # Find the search box by its id and type the keyword
    driver.find_element(By.CLASS_NAME, 'btn-search').click()  # Click the search button
    driver.maximize_window()  # Maximize the window

    page = driver.find_element(By.XPATH, '//*[@id="mainsrp-pager"]/div/div/div/div[1]')  # Element that shows the page count
    page = page.text  # Extract the element's text
    page = re.findall(r'(\d+)', page)[0]  # The first number in the text is the page count
    return int(page)
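
The `re.findall(...)[0]` lookup in `search_product()` raises an `IndexError` when the pager text contains no digits, for example if a login page is shown instead of the search results. A minimal sketch of a more defensive parser follows; the function name `parse_total_pages`, the `default` parameter, and the sample strings are hypothetical and not taken from the site:

```python
import re


def parse_total_pages(pager_text, default=1):
    """Return the first integer found in the pager text, or `default` if there is none."""
    match = re.search(r'\d+', pager_text)
    if match is None:
        return default
    return int(match.group())


if __name__ == '__main__':
    print(parse_total_pages('Total 100 pages'))  # 100
    print(parse_total_pages('no pager here'))    # 1
```

Falling back to a single page keeps the script running on the results that are visible, at the cost of silently missing later pages, so logging the fallback may be worthwhile.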





4. Get product data

import csv
from selenium.webdriver.common.by import By


def get_product():
    """Scrape every product on the current results page and append it to data.csv."""
    divs = driver.find_elements(By.XPATH, '//div[@class="items"]/div[@class="item J_MouserOnverReq  "]')
    for div in divs:
        info = div.find_element(By.XPATH, './/div[@class="row row-2 title"]/a').text  # Product title
        price = div.find_element(By.XPATH, './/strong').text + ' yuan'  # Product price
        deal = div.find_element(By.XPATH, './/div[@class="deal-cnt"]').text  # Number of buyers who paid
        name = div.find_element(By.XPATH, './/div[@class="shop"]/a').text  # Shop name
        print(info, price, deal, name, sep='|')
        with open('data.csv', 'a', newline='') as csvfile:  # newline='' avoids blank lines between rows on Windows
            csvwriter = csv.writer(csvfile, delimiter=',')  # Use a comma as the field separator
            csvwriter.writerow([info, price, deal, name])  # Append one row to the CSV file


def main():
    page = search_product(keyword)  # Run the search and read the total page count
    print('Total pages:', page)
    get_product()  # Scrape the first page of results
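
Reopening `data.csv` for every product, as `get_product()` does, works, but writing all rows through a single file handle is cheaper and makes it easy to add a header row. A minimal sketch, assuming the rows have already been collected into a list; `save_rows` and the column names are made up for illustration:

```python
import csv


def save_rows(rows, path='data.csv'):
    """Write a header plus all product rows to `path` in one pass (overwrites the file)."""
    with open(path, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['title', 'price', 'deals', 'shop'])  # hypothetical column names
        writer.writerows(rows)
```

Inside `get_product()` one could append `(info, price, deal, name)` to a list during the loop and call `save_rows(...)` once afterwards. Setting the encoding explicitly avoids depending on the platform default when titles contain non-ASCII characters.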





Added by arie_parie on Wed, 06 May 2020 17:36:24 +0300