Day 27 of python learning: summary of the fourth week -- Office Automation
file
-
By storing data in files through the file system, we realize the persistence of data
-
Computer file system is a method of storing and organizing computer data, which makes it easy to access and find data
-
The file system uses the abstract logical concept of file and tree directory to replace the data block concept of physical devices such as hard disk, optical disk and flash memory
Operations on files
Opening and closing of files
- Open through the open function, and specify the file name, operation mode, character code and other information through parameters
Operation mode | Specific meaning |
---|---|
'r' | Read (default) |
'w' | Write (truncates / overwrites the previous content first) |
'x' | Write. If the file already exists, an exception will be generated (create) |
'a' | Append to write the content to the end of the existing file |
'b' | Binary mode (picture) |
't' | Text mode (default) |
'+' | Update (both read and write) |
When using the open function, you can specify the character encoding used for reading and writing files through the encoding parameter. The common encoding is utf-8
- close the file to release the memory occupied by the file
File simple operation
-
Read the file through the read function
Or read the line one by one through loop and readline
-
File pointer operation seek function
-
Operations on csv files:
- Self contained package tool csv
- Write through the writerow method
-
Write TXT files through write
-
Download pictures: via
- requests.get() get network data
- Binary data obtained by writing operation records (i.e. picture data)
Excel
Third party Library:
- xlrd / xlwt / xlutils - > compatible with lower version Excel files (xls)
- openpyxl —> Office 2007+ —> xlsx
Reading and writing operations of Excel
Reading and writing Excel using xlrd and xlwd
Read operation
- Through open_workbook() opens an Excel file
- Through sheet_names() gets the names of all worksheets
- Through sheet_by_name() (named according to the worksheet) and sheet_by_index (named according to the worksheet number) get the specified worksheet
- Get the specified number of rows and columns through nrows and ncols respectively
- Get the specified column through row() (sequence number of row) and row_slice() (range of row)
- Get the specified column through col() (sequence number of column) and col_slice() (range of column)
- Select the specified cell through cell and obtain the data of the specified cell through value
Write operation
- Create workbook through Workbook
- Via add_sheet (create sheet)
- Through XFStyle, pattern, pattern SOLID_ Pattern prepare to beautify cells
- Through style colour_ Map [] specify cell color
- Specify the cell font size through Font()
- Specify the row height and column width through width and height
- Specify cell text Alignment through Alignment
- Record data through write
- Save the file through save
Using openpyxl to read and write Excel files
read file
- Through load_workbook load file
- Get the sheet name through sheetnames
- Get the worksheet through the worksheet
- Get cell range through dimensions
- By max_row and max.colum get the number of rows and columns
- max_row and max_column gets the number of rows and columns respectively
- Obtain the specified cell through cell (serial number of the specified cell) or [] (specified cell), and obtain the cell data through value
Write operation
- Create workbook through Workbook
- Through create_sheet() creates a worksheet and is activated through active
- Set the cell name through the title
- Assign a value to the specified cell through cell
- Save the file through save
Gadget datatime
- Return the formatted date by entering year, month, day, hour, minute and second. If it is not entered, it is 0 by default
- Get the current time of the computer through now()
- The time difference object will be obtained by directly making the difference. At this time, the days and seconds of the difference can be obtained through days and seconds
- Format time and date through strftime
word
Install third-party library
python-docx —> pip install python-docx pillow
Pilot - > PIL - > Python image library, image processing tool
Reading and writing to Word
- Create Word through Document()
- Via add_heading() adds a title and controls the title level through the parameter level
- Via add Paragragh() adds a paragraph and controls the style through the style parameter
- Via add Run() creates text in the specified format
- Bold with hold = True
- By font Size = pt() sets the font size
- Italics are controlled by italic = True
- Control underscores with underline = True
- Via add_picture add a picture and control the file size through width
- Via add_section() add a semicolon
- Via add_table add table
- Control the number of rows and columns of the created table through the parameters rows and cols
- text specifies the input string by [] controlling the position of the input data
- Via add_row() and add_col() to add rows and columns
- Via add_page_break() add page break
- Save the file through save
smtplib
Header
MIMEMultipart
MIMEText
quote
Writing and sending mail
- Obtain authorization code
- Login mailbox POP3/SMTP/IMAP
- POP3/SMTP service has been turned off and on (click Start)
- The authorization code is obtained after the code scanning application is successful
import smtplib from email.header import Header from email.mime.multipart import MIMEMultipart from email.mime.text import MIMEText from urllib.parse import quote def create_attachment(path, filename): with open(f'{path}/{filename}', 'rb') as file: attachment = MIMEText(file.read(), 'base64', 'utf-8') # Specify content type attachment['content-type'] = 'application/octet-stream' # Process file names to percent encoding filename = quote(filename) # Specifies how content is disposed of attachment['content-disposition'] = f'attachment; filename="{filename}"' return attachment # Create mail body object email = MIMEMultipart() # Set sender, recipient, and principal email['From'] = Header('Your personal email') email['To'] = Header('The person you want to send to is number 1;The person you want to send to is number 2') email['Subject'] = Header('Please check the resignation certificate', 'utf-8') # Add message body content content = """According to German media reports, members of the German train drivers' Union voted on the 9th local time, A national strike is scheduled to take place on the 10th local time, and the strike in freight transportation began at 19:00 local time on the 10th. Since then, from 2 a.m. on the 11th to 2 a.m. on the 13th, there will be a 48 hour strike on passenger transport and railway infrastructure across Germany.""" email.attach(MIMEText(content, 'plain', 'utf-8')) # Add an attachment email.attach(create_attachment('resources', 'leaving certificate.docx')) # Add another attachment email.attach(create_attachment('resources', 'Alibaba stock data in 2020.xlsx')) # Create SMTP_SSL object (connect to mail server) # SMTP server, because I use Netease's 126 mailbox smtp_obj = smtplib.SMTP_SSL('smtp.126.com', 465) # Login through user name and authorization code smtp_obj.login('Personal email', 'The authorization code application steps are as follows') # Send mail (sender, recipient, mail content (string)) smtp_obj.sendmail( 'Personal email', ['The person you want to send to is number 1', 'The person you want to send to is number 2'], email.as_string() )
Send SMS
# Send SMS via SMS platform import random import requests def send_message_by_luosimao(tel, message): """Send SMS (call screw cap SMS Gateway)""" resp = requests.post( url='http://sms-api.luosimao.com/v1/send.json', auth=('api', 'key-20553a237c74f641d7cc3cfd412771a1'), data={ 'mobile': tel, 'message': message }, timeout=10, verify=False ) return resp.json() def gen_mobile_code(length=6): """Generate mobile phone verification code of specified length""" return ''.join(random.choices('0123456789', k=length)) def main(): code = gen_mobile_code() message = f'Your SMS verification code is{code},You can't tell anyone! [ python Course]' print(send_message_by_luosimao('cell-phone number', message)) if __name__ == '__main__': main()
regular expression
- Regular expression - > pattern - > pattern of matching string - > complex matching rules
- Character set - > [] - > [a-za-z0-9_] {6,20} —> \w{6,20} —> ^\w{6,20}$
- Python uses regular expressions in two ways:
- The regular expression object is not created, and the function is called directly for matching operation
- match
- fullmatch
- Create a regular expression object (Pattern) and realize the matching operation by sending a message to the object
- compile
- The regular expression object is not created, and the function is called directly for matching operation
Application method of regular expression
Match - match - match from scratch - > match object - > group()
search - search - match from anywhere - > match object - > group()
findall - find out everything that matches the regular expression from the string - > list [STR]
example:
Check whether the user name is legal
# For website registration, the user name must be letters, numbers and underscores, with a length of 6 to 20 characters # Check whether the user name is legal import re username = input('enter one user name: ') # Create a Pattern object by compiling regular expressions in compile username_pattern = re.compile(r'^\w{6,20}$') print(type(username_pattern)) # Match checking is achieved by sending a message to the Pattern object matcher = username_pattern.match(username) print(type(matcher)) if matcher is None: print('Invalid user name!!!') else: print(matcher.group())
Extract the part that matches the regular expression from the string
import re content = """Police call: 110, our class is Python-2105 Ben, my QQ The number is 1234567. My mobile phone number is 13811223344. Thank you!""" pattern = re.compile(r'\d+') matcher = pattern.search(content) while matcher: print(matcher.group()) print(matcher.start(), matcher.end()) matcher = pattern.search(content, matcher.end()) results = pattern.findall(content) for result in results: print(result) results = re.findall(r'\d+', content) for result in results: print(result)
Get news headlines and links from web pages
import re import requests resp = requests.get('https://www.sohu.com/') content = resp.text pattern1 = re.compile(r'href="http.+?"') matcher = pattern1.search(content) while matcher: print(matcher.group()[6:-1]) matcher = pattern1.search(content, matcher.end()) pattern2 = re.compile(r'title=".+?"') titles_list = pattern2.findall(content) for title in titles_list: print(title[7:-1])
Regular expression capture group
import re import requests # Matches the entire a tag, but captures only the contents in () -- > the capture group of the regular expression pattern = re.compile(r'<a\s.*?href="(.+?)".*?title="(.+?)".*?>') resp = requests.get('https://www.sohu.com/') results = pattern.findall(resp.text) for href, title in results: print(title) print(href)
Bad content filtering
import re content = 'Xiao Ming is a sand carving pen, FUck you!' pattern = re.compile(r'[Silly sand shark][Hanging pen carving]|Ma Xiaoming|fuck|shit', flags=re.IGNORECASE) # modified_content = re.sub(r '[silly Shasha] [forced pen carving base] | Ma Huateng | fuck | hit', '*', content, flags = re. I) modified_content = pattern.sub('*', content) print(modified_content)
Split string
import re poem = 'The bright moon in front of the window is suspected to be frost on the ground. Raising my head, I see the moon so bright; withdrawing my eyes, my nostalgia comes around.' pattern = re.compile(r'[,. ]') sentences_list = pattern.split(poem) print(sentences_list) sentences_list = [sentence for sentence in sentences_list if sentence] print(sentences_list) for sentence in sentences_list: print(sentence)
ub('*', content)
print(modified_content)
#### Split string ```python import re poem = 'The bright moon in front of the window is suspected to be frost on the ground. Raising my head, I see the moon so bright; withdrawing my eyes, my nostalgia comes around.' pattern = re.compile(r'[,. ]') sentences_list = pattern.split(poem) print(sentences_list) sentences_list = [sentence for sentence in sentences_list if sentence] print(sentences_list) for sentence in sentences_list: print(sentence)