Day 27 of python learning: summary of the fourth week -- Office Automation

file

By storing data in files through the file system, we realize the persistence of data
Computer file system is a method of storing and organizing computer data, which makes it easy to access and find data
The file system uses the abstract logical concept of file and tree directory to replace the data block concept of physical devices such as hard disk, optical disk and flash memory

Operations on files

Opening and closing of files

Open through the open function, and specify the file name, operation mode, character code and other information through parameters

Operation mode	Specific meaning
'r'	Read (default)
'w'	Write (truncates / overwrites the previous content first)
'x'	Write. If the file already exists, an exception will be generated (create)
'a'	Append to write the content to the end of the existing file
'b'	Binary mode (picture)
't'	Text mode (default)
'+'	Update (both read and write)

When using the open function, you can specify the character encoding used for reading and writing files through the encoding parameter. The common encoding is utf-8

close the file to release the memory occupied by the file

File simple operation

Read the file through the read function

Or read the line one by one through loop and readline
File pointer operation seek function
Operations on csv files:
- Self contained package tool csv
- Write through the writerow method
Write TXT files through write
Download pictures: via
- requests.get() get network data
- Binary data obtained by writing operation records (i.e. picture data)

Excel

Third party Library:

xlrd / xlwt / xlutils - > compatible with lower version Excel files (xls)
openpyxl —> Office 2007+ —> xlsx

Reading and writing operations of Excel

Reading and writing Excel using xlrd and xlwd

Read operation

Through open_workbook() opens an Excel file
Through sheet_names() gets the names of all worksheets
Through sheet_by_name() (named according to the worksheet) and sheet_by_index (named according to the worksheet number) get the specified worksheet
Get the specified number of rows and columns through nrows and ncols respectively
Get the specified column through row() (sequence number of row) and row_slice() (range of row)
Get the specified column through col() (sequence number of column) and col_slice() (range of column)
Select the specified cell through cell and obtain the data of the specified cell through value

Write operation

Create workbook through Workbook
Via add_sheet (create sheet)
Through XFStyle, pattern, pattern SOLID_ Pattern prepare to beautify cells
Through style colour_ Map [] specify cell color
Specify the cell font size through Font()
Specify the row height and column width through width and height
Specify cell text Alignment through Alignment
Record data through write
Save the file through save

Using openpyxl to read and write Excel files

read file

Through load_workbook load file
Get the sheet name through sheetnames
Get the worksheet through the worksheet
Get cell range through dimensions
By max_row and max.colum get the number of rows and columns
max_row and max_column gets the number of rows and columns respectively
Obtain the specified cell through cell (serial number of the specified cell) or [] (specified cell), and obtain the cell data through value

Write operation

Create workbook through Workbook
Through create_sheet() creates a worksheet and is activated through active
Set the cell name through the title
Assign a value to the specified cell through cell
Save the file through save

Gadget datatime

Return the formatted date by entering year, month, day, hour, minute and second. If it is not entered, it is 0 by default
Get the current time of the computer through now()
The time difference object will be obtained by directly making the difference. At this time, the days and seconds of the difference can be obtained through days and seconds
Format time and date through strftime

word

Install third-party library

python-docx —> pip install python-docx pillow
Pilot - > PIL - > Python image library, image processing tool

Reading and writing to Word

Create Word through Document()
Via add_heading() adds a title and controls the title level through the parameter level
Via add Paragragh() adds a paragraph and controls the style through the style parameter
Via add Run() creates text in the specified format
- Bold with hold = True
- By font Size = pt() sets the font size
- Italics are controlled by italic = True
- Control underscores with underline = True
Via add_picture add a picture and control the file size through width
Via add_section() add a semicolon
Via add_table add table
- Control the number of rows and columns of the created table through the parameters rows and cols
- text specifies the input string by [] controlling the position of the input data
- Via add_row() and add_col() to add rows and columns
Via add_page_break() add page break
Save the file through save

mail

smtplib
Header
MIMEMultipart
MIMEText
quote

Writing and sending mail

Obtain authorization code
- Login mailbox POP3/SMTP/IMAP
- POP3/SMTP service has been turned off and on (click Start)
- The authorization code is obtained after the code scanning application is successful

import smtplib
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from urllib.parse import quote


def create_attachment(path, filename):
    with open(f'{path}/{filename}', 'rb') as file:
        attachment = MIMEText(file.read(), 'base64', 'utf-8')
        # Specify content type
        attachment['content-type'] = 'application/octet-stream'
        # Process file names to percent encoding
        filename = quote(filename)
        # Specifies how content is disposed of
        attachment['content-disposition'] = f'attachment; filename="{filename}"'
    return attachment


# Create mail body object
email = MIMEMultipart()
# Set sender, recipient, and principal
email['From'] = Header('Your personal email')
email['To'] = Header('The person you want to send to is number 1;The person you want to send to is number 2')
email['Subject'] = Header('Please check the resignation certificate', 'utf-8')
# Add message body content
content = """According to German media reports, members of the German train drivers' Union voted on the 9th local time,
A national strike is scheduled to take place on the 10th local time, and the strike in freight transportation began at 19:00 local time on the 10th.
Since then, from 2 a.m. on the 11th to 2 a.m. on the 13th, there will be a 48 hour strike on passenger transport and railway infrastructure across Germany."""
email.attach(MIMEText(content, 'plain', 'utf-8'))
# Add an attachment
email.attach(create_attachment('resources', 'leaving certificate.docx'))
# Add another attachment
email.attach(create_attachment('resources', 'Alibaba stock data in 2020.xlsx'))

# Create SMTP_SSL object (connect to mail server)
# SMTP server, because I use Netease's 126 mailbox
smtp_obj = smtplib.SMTP_SSL('smtp.126.com', 465)
# Login through user name and authorization code
smtp_obj.login('Personal email', 'The authorization code application steps are as follows')
# Send mail (sender, recipient, mail content (string))
smtp_obj.sendmail(
    'Personal email',
    ['The person you want to send to is number 1', 'The person you want to send to is number 2'],
    email.as_string()
)

Send SMS

# Send SMS via SMS platform

import random
import requests


def send_message_by_luosimao(tel, message):
    """Send SMS (call screw cap SMS Gateway)"""
    resp = requests.post(
        url='http://sms-api.luosimao.com/v1/send.json',
        auth=('api', 'key-20553a237c74f641d7cc3cfd412771a1'),
        data={
            'mobile': tel,
            'message': message
        },
        timeout=10,
        verify=False
    )
    return resp.json()


def gen_mobile_code(length=6):
    """Generate mobile phone verification code of specified length"""
    return ''.join(random.choices('0123456789', k=length))


def main():
    code = gen_mobile_code()
    message = f'Your SMS verification code is{code}，You can't tell anyone! [ python Course]'
    print(send_message_by_luosimao('cell-phone number', message))


if __name__ == '__main__':
    main()

regular expression

Regular expression - > pattern - > pattern of matching string - > complex matching rules
Character set - > [] - > [a-za-z0-9_] {6,20} —> \w{6,20} —> ^\w{6,20}$
Python uses regular expressions in two ways:
- The regular expression object is not created, and the function is called directly for matching operation
  - match
  - fullmatch
- Create a regular expression object (Pattern) and realize the matching operation by sending a message to the object
  - compile

Application method of regular expression

Match - match - match from scratch - > match object - > group()
search - search - match from anywhere - > match object - > group()
findall - find out everything that matches the regular expression from the string - > list [STR]

example:

Check whether the user name is legal

# For website registration, the user name must be letters, numbers and underscores, with a length of 6 to 20 characters
# Check whether the user name is legal
import re

username = input('enter one user name: ')
# Create a Pattern object by compiling regular expressions in compile
username_pattern = re.compile(r'^\w{6,20}$')
print(type(username_pattern))
# Match checking is achieved by sending a message to the Pattern object
matcher = username_pattern.match(username)
print(type(matcher))
if matcher is None:
    print('Invalid user name!!!')
else:
    print(matcher.group())

Extract the part that matches the regular expression from the string

import re

content = """Police call: 110, our class is Python-2105 Ben,
my QQ The number is 1234567. My mobile phone number is 13811223344. Thank you!"""


pattern = re.compile(r'\d+')
matcher = pattern.search(content)
while matcher:
    print(matcher.group())
    print(matcher.start(), matcher.end())
    matcher = pattern.search(content, matcher.end())

results = pattern.findall(content)
for result in results:
    print(result)

results = re.findall(r'\d+', content)
for result in results:
    print(result)

Get news headlines and links from web pages

import re

import requests

resp = requests.get('https://www.sohu.com/')
content = resp.text

pattern1 = re.compile(r'href="http.+?"')
matcher = pattern1.search(content)
while matcher:
    print(matcher.group()[6:-1])
    matcher = pattern1.search(content, matcher.end())

pattern2 = re.compile(r'title=".+?"')
titles_list = pattern2.findall(content)
for title in titles_list:
    print(title[7:-1])

Regular expression capture group

import re

import requests

# Matches the entire a tag, but captures only the contents in () -- > the capture group of the regular expression
pattern = re.compile(r'<a\s.*?href="(.+?)".*?title="(.+?)".*?>')
resp = requests.get('https://www.sohu.com/')
results = pattern.findall(resp.text)
for href, title in results:
    print(title)
    print(href)

Bad content filtering

import re

content = 'Xiao Ming is a sand carving pen, FUck you!'
pattern = re.compile(r'[Silly sand shark][Hanging pen carving]|Ma Xiaoming|fuck|shit', flags=re.IGNORECASE)
# modified_content = re.sub(r '[silly Shasha] [forced pen carving base] | Ma Huateng | fuck | hit', '*', content, flags = re. I)
modified_content = pattern.sub('*', content)
print(modified_content)

Split string

import re

poem = 'The bright moon in front of the window is suspected to be frost on the ground. Raising my head, I see the moon so bright; withdrawing my eyes, my nostalgia comes around.'
pattern = re.compile(r'[，. ]')
sentences_list = pattern.split(poem)
print(sentences_list)
sentences_list = [sentence for sentence in sentences_list if sentence]
print(sentences_list)
for sentence in sentences_list:
    print(sentence)

ub('*', content)
print(modified_content)

#### Split string

```python
import re

poem = 'The bright moon in front of the window is suspected to be frost on the ground. Raising my head, I see the moon so bright; withdrawing my eyes, my nostalgia comes around.'
pattern = re.compile(r'[，. ]')
sentences_list = pattern.split(poem)
print(sentences_list)
sentences_list = [sentence for sentence in sentences_list if sentence]
print(sentences_list)
for sentence in sentences_list:
    print(sentence)

Keywords: Python Excel regex csv

Added by bbreslauer on Sun, 26 Dec 2021 22:27:59 +0200

Programming VIP

Day 27 of python learning: summary of the fourth week -- Office Automation