***
(1) The last part of task4 covers front-end and back-end interaction. The first two parts covered the front-end foundations (HTML, CSS, JavaScript and Vue basics) and the Flask web back-end framework. With those in place, you can now learn how the front end and back end interact.
(2) This part is organized around the news list page, the popular recommendation page and the news details page. The lists are displayed as a waterfall flow layout: https://www.jianshu.com/p/cea62b6868ce.
1, Project style presentation
This section shows the project as a whole, which is divided into the recommendation page, the popular page and the news details page.
(Because of content review requirements, the screenshots are blurred.)
2, Backend directory structure
    news_rec_sys/
        conf/
            dao_config.py
        controller/
        dao/
        materials/
            news_scrapy/
            user_proccess/
            material_proccess
        recpocess/
            recall/
            rank/
            online.py
            offline.py
        scheduler/
        server.py

- conf/dao_config.py: the overall configuration file of the project
- controller/: the interfaces used to operate the databases in the project
- dao/: the entity classes of the project, corresponding to the database tables
- materials/: the material part of the project, mainly used to crawl news materials and to process user portraits and news portraits
- recpocess/: the recommendation module of the project, mainly including recall and ranking, as well as the online services and offline processing
- scheduler/: the scripts for the scheduled tasks of the project
- server.py: the entry point of the project back end, containing the back-end interfaces of the project as a whole
In this project:
(1) The front end mainly uses the Vue framework + Mint UI;
(2) The back end is mainly built with Flask + MySQL + MongoDB + Redis;
(3) The front end and back end are separated and exchange data in JSON format.
The back-end logic of the project is mainly in server.py, which includes user registration, login, the recommendation list, the popular list, the news details page and user behavior recording.
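The JSON exchange described above can be sketched with the standard library alone. This is only an illustration of the request/response envelope ("code", "msg", optional "data") seen in the endpoints of this article; the helper names `make_response` and `parse_request` are hypothetical and do not exist in the project.

```python
import json


def make_response(code, msg, data=None):
    """Build the JSON text of a response envelope (hypothetical helper)."""
    body = {"code": code, "msg": msg}
    if data is not None:
        body["data"] = data
    return json.dumps(body, ensure_ascii=False)


def parse_request(raw_body):
    """Decode a raw request body (bytes or str) into a dict."""
    return json.loads(raw_body)


# The front end would send something like this as the POST body.
raw = b'{"username": "wang", "passwd": "secret"}'
request_dict = parse_request(raw)
reply = make_response(200, "login success")
```

In the real server.py, Flask's `request.get_data()` supplies the raw body and `jsonify` produces the response, so these helpers only mirror that flow.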
1. User registration and login
To give every user a personalized feed, everyone who uses the system must first register and log in. Each user is assigned a unique user id, and recommendations are personalized according to that user's historical behavior.
(1) Registration part:
    import json

    import snowflake.client
    from flask import jsonify, request

    # RegisterUser and UserAction come from the project's own modules.

    def register():
        """User registration"""
        request_str = request.get_data()
        request_dict = json.loads(request_str)

        user = RegisterUser()
        user.username = request_dict["username"]
        user.passwd = request_dict["passwd"]

        # Check whether the user name is already taken
        result = UserAction().user_is_exist(user, "register")
        if result != 0:
            return jsonify({"code": 500, "msg": "this username is exists"})

        user.userid = snowflake.client.get_guid()  # Generate a unique user id with the snowflake algorithm
        user.age = request_dict["age"]
        user.gender = request_dict["gender"]
        user.city = request_dict["city"]

        save_res = UserAction().save_user(user)  # Save the registered user's information to MySQL
        if not save_res:
            return jsonify({"code": 500, "msg": "register fail."})

        return jsonify({"code": 200, "msg": "register success."})

The registration code above records a few basic attributes of the user and writes the registration information into a MySQL table. Note that, to avoid user id collisions when several users register concurrently, Twitter's snowflake algorithm is used to generate a unique id for each user.
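To make the idea behind snowflake ids concrete, here is a minimal sketch of a Snowflake-style generator. It is not the implementation behind `snowflake.client.get_guid()`; the class name `SnowflakeSketch`, the epoch value and the bit layout (timestamp, 10-bit worker id, 12-bit sequence) are assumptions chosen for illustration.

```python
import threading
import time


class SnowflakeSketch:
    """Illustrative Snowflake-style id generator (hypothetical class)."""

    EPOCH_MS = 1288834974657  # custom epoch in milliseconds (illustrative value)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now_ms = int(time.time() * 1000)
            if now_ms < self.last_ms:
                # Clock moved backwards: keep using the last timestamp.
                now_ms = self.last_ms
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now_ms <= self.last_ms:
                        now_ms = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now_ms
            timestamp_part = (now_ms - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS)
            worker_part = self.worker_id << self.SEQUENCE_BITS
            return timestamp_part | worker_part | self.sequence


generator = SnowflakeSketch(worker_id=1)
ids = [generator.next_id() for _ in range(5000)]
```

Because the timestamp occupies the high bits and the sequence counter breaks ties within a millisecond, ids from one worker are unique and increasing, and different workers never collide as long as their worker ids differ.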
(2) Login section:
    @app.route('/recsys/login', methods=["POST"])
    def login():
        """User login"""
        request_str = request.get_data()
        request_dict = json.loads(request_str)

        user = RegisterUser()
        user.username = request_dict["username"]
        user.passwd = request_dict["passwd"]

        # Check whether the user name exists and the password matches
        try:
            result = UserAction().user_is_exist(user, "login")
            if result == 1:
                return jsonify({"code": 200, "msg": "login success"})
            elif result == 2:
                # Wrong password
                return jsonify({"code": 500, "msg": "passwd is error"})
            else:
                return jsonify({"code": 500, "msg": "this username is not exist!"})
        except Exception as e:
            return jsonify({"code": 500, "msg": "login fail."})

In the login part, the front end sends the entered account and password to /recsys/login in a POST request, and the UserAction().user_is_exist() method checks them against the database. A result of 1 means the account and password are correct, 2 means the password is wrong, and 0 means the user does not exist.
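The three result codes can be illustrated with a small in-memory sketch. The dictionary `USERS` stands in for the MySQL user table, and `user_is_exist_sketch` and `login_response` are hypothetical names, not the project's `UserAction` API. Passwords are compared in plain text only to keep the example short.

```python
# Hypothetical in-memory stand-in for the MySQL user table: username -> password.
USERS = {"wang": "secret"}

USER_NOT_FOUND, LOGIN_OK, WRONG_PASSWORD = 0, 1, 2


def user_is_exist_sketch(username, passwd):
    """Return 0 if the user is unknown, 1 if the password matches, 2 otherwise."""
    if username not in USERS:
        return USER_NOT_FOUND
    return LOGIN_OK if USERS[username] == passwd else WRONG_PASSWORD


def login_response(username, passwd):
    """Map the result code to the JSON body returned by the login endpoint."""
    result = user_is_exist_sketch(username, passwd)
    if result == LOGIN_OK:
        return {"code": 200, "msg": "login success"}
    if result == WRONG_PASSWORD:
        return {"code": 500, "msg": "passwd is error"}
    return {"code": 500, "msg": "this username is not exist!"}
```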
2. Recommended page list
In the style presentation section, the first figure shows the recommendation page list, which presents the news in a waterfall flow layout.

    @app.route('/recsys/rec_list', methods=["GET"])
    def rec_list():
        """Recommendation page"""
        user_name = request.args.get('user_id')  # The front end passes the user name in the user_id parameter
        page_id = request.args.get('page_id')
        if user_name is None or page_id is None:
            return jsonify({"code": 2000, "msg": "user_id or page_id is none!"})

        # Look up the user id by user name
        user_id = UserAction().get_user_id_by_name(user_name)
        if not user_id:
            # Unknown user: return a JSON error
            return jsonify({"code": 2000, "msg": "user not register"})

        try:
            # Get the news information for the recommendation list
            rec_news_list = recsys_server.get_rec_list(user_id, page_id)
            if len(rec_news_list) == 0:
                return jsonify({"code": 500, "msg": "rec_list data is empty."})
            return jsonify({"code": 200, "msg": "request rec_list success.", "data": rec_news_list, "user_id": user_id})
        except Exception as e:
            print(str(e))
            return jsonify({"code": 500, "msg": "redis fail."})

The main logic here is that the front end requests the "/recsys/rec_list" interface, the back end looks up the user id in the database from the user name passed by the front end, and then obtains the recommendation list from the recommendation service (recsys_server) according to the user id.
2.1. Obtain user recommendation list
The user's recommendation list is obtained through the get_rec_list(user_id, page_id) interface of the recommendation service. Two parameters are required:
- user_id: with the user id, we can find the news list built for the user in Redis and return the news information to the front end.
- page_id: the page id locates the position in the list up to which news has already been recommended to the user, so that new news can be fetched starting from that position.
    def get_rec_list(self, user_id, page_id):
        """Return the news shown on the given page.
        user_id is needed later for personalized recommendation.
        """
        # Compute the range of news ids in Redis from the page id, assuming 10 news items per page
        s = (int(page_id) - 1) * 10
        e = s + 9

        # Returns a list of news ids
        news_id_list = self.reclist_redis_db.zrange("rec_list", start=s, end=e)

        # Fetch the content for each news_id; the result is a list of news info dicts in display order
        news_info_list = []
        news_expose_list = []
        for news_id in news_id_list:
            news_info_dict = self._get_news_simple(news_id)
            news_info_list.append(news_info_dict)
            news_expose_list.append(news_info_dict["news_id"])  # Record in the user exposure table [user_exposure]

        self._save_user_exposure(user_id, news_expose_list)  # Exposure table
        return news_info_list

The logic here is to compute, from the page id, which range of the recommendation list in Redis to read. After getting the list of news ids, the _get_news_simple() method fetches the content needed to display each news item from MySQL and Redis.
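The page arithmetic can be checked in isolation. The sketch below uses a plain Python list in place of the Redis sorted set; `page_range`, `get_page` and `ranked_ids` are hypothetical names. Redis `ZRANGE` treats both the start and the end index as inclusive, which is why the end is `start + 9` rather than `start + 10`.

```python
PAGE_SIZE = 10  # assumption taken from the article: 10 news items per page


def page_range(page_id, page_size=PAGE_SIZE):
    """Return the inclusive (start, end) indexes for a 1-based page id."""
    page = int(page_id)
    if page < 1:
        raise ValueError("page_id must be >= 1")
    start = (page - 1) * page_size
    end = start + page_size - 1
    return start, end


# Hypothetical stand-in for the sorted set "rec_list": 25 ranked news ids.
ranked_ids = [f"news_{i}" for i in range(25)]


def get_page(page_id):
    """Emulate ZRANGE's inclusive end with a Python slice."""
    start, end = page_range(page_id)
    return ranked_ids[start:end + 1]
```

With 25 items, page 3 returns the last 5 items and page 4 returns an empty list, which corresponds to the "rec_list data is empty." branch of the endpoint.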
To improve the user experience, news that has already been exposed to the user in the recommendation list will not be exposed to the same user again on the popular page on the same day. The _save_user_exposure() method is therefore used here to store the exposed news in Redis, so that the popular recommendation can filter its content according to the user's exposure records.
The returned data format is as follows:

    "data": [
        {
            "news_id": "4bfb8aab-bcd8-4c74-b7fd-92b28ca5df69",
            "cate": "domestic",
            "read_num": 0,
            "likes": 0,
            "collections": 0,
            "ctime": "2021-11-30 12:07",
            "title": "The fifth session of the 13th CPPCC Beijing Municipal Committee will be held on January 5, 2022"
        },
        ...
        {
            "news_id": "4ded60ac-aa2f-408b-af4d-09ca0c58b50a",
            "cate": "domestic",
            "read_num": 6,
            "likes": 1,
            "collections": 0,
            "ctime": "2021-11-30 10:44",
            "title": "Hu Quanshun, former Secretary of Jiangxi Wanzai county Party committee, was sentenced to 11 years and six months"
        }
    ]
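The exposure filter can be sketched as follows. The dictionary `_exposure` is a hypothetical in-memory stand-in for the Redis exposure store, keyed by user and day so that exposure only suppresses news on the same day; the function names are illustrative and are not the project's private methods.

```python
import datetime

# Hypothetical stand-in for Redis: (user_id, day) -> set of exposed news ids.
_exposure = {}


def save_user_exposure(user_id, news_ids, day):
    """Remember which news ids were shown to the user on the given day."""
    _exposure.setdefault((user_id, day), set()).update(news_ids)


def filter_hot_list(user_id, hot_news_ids, day):
    """Drop hot news the user has already been shown on the given day, keeping order."""
    seen = _exposure.get((user_id, day), set())
    return [news_id for news_id in hot_news_ids if news_id not in seen]


today = datetime.date(2021, 11, 30)
save_user_exposure("u1", ["n1", "n2"], today)
remaining = filter_hot_list("u1", ["n1", "n3", "n2", "n4"], today)
```

In a real Redis setup the per-day scoping would more likely be handled with a key expiry, which this sketch leaves out.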
3. Popular recommendation page
On the popular recommendation page, the front end requests the '/recsys/hot_list' interface, passing the user name and the current page number, to obtain the popular news list. The main logic is the same as in section 2. The difference is that the popular news is obtained through the get_hot_list() method of the recommendation service (recsys_server).

    @app.route('/recsys/hot_list', methods=["GET"])
    def hot_list():
        """Popular page"""
        if request.method == "GET":
            user_name = request.args.get('user_id')
            page_id = request.args.get('page_id')
            if user_name is None or page_id is None:
                return jsonify({"code": 2000, "msg": "user_name or page_id is none!"})

            # Look up the user id by user name
            user_id = UserAction().get_user_id_by_name(user_name)
            if not user_id:
                # Unknown user: return a JSON error
                return jsonify({"code": 2000, "msg": "user not register"})

            try:
                # Get the news information for the hot list
                rec_news_list = recsys_server.get_hot_list(user_id)
                if len(rec_news_list) == 0:
                    return jsonify({"code": 200, "msg": "request redis data fail."})
                # rec_news_list = recsys_server.get_hot_list(user_id, page_id)
                return jsonify({"code": 200, "msg": "request hot_list success.", "data": rec_news_list, "user_id": user_id})
            except Exception as e:
                print(str(e))
                return jsonify({"code": 2000, "msg": "request hot_list fail."})

The back-end logic here is similar to that of the recommendation list; the main difference lies between get_hot_list() and get_rec_list(). The internal details of the popular recommendation will be introduced later and are not repeated here.
4. News details page
In the style presentation section, the third figure shows the news details page. The page shows the details of a news item and includes two buttons for collecting explicit feedback: users can like or collect (favorite) an article according to their preference.

    @app.route('/recsys/news_detail', methods=["GET"])
    def news_detail():
        """Details of an article"""
        user_name = request.args.get('user_name')
        news_id = request.args.get('news_id')
        if news_id is None or user_name is None:
            return jsonify({"code": 2000, "msg": "news_id is none or user_name is none!"})

        user_id = UserAction().get_user_id_by_name(user_name)

        try:
            news_detail = recsys_server.get_news_detail(news_id)

            # Tell the front end whether this user has already liked or collected the news
            news_detail["likes"] = UserAction().get_likes_counts_by_user(user_id, news_id) > 0
            news_detail["collections"] = UserAction().get_coll_counts_by_user(user_id, news_id) > 0

            return jsonify({"code": 0, "msg": "request news_detail success.", "data": news_detail})
        except Exception as e:
            print(str(e))
            return jsonify({"code": 2000, "msg": "error"})

The above is the back-end logic of the details page. It first checks that the news id and the user name are not empty, obtains the user id from MySQL through the user name, and then fetches the content by news id through the get_news_detail() method of the recsys_server service.
If the user has previously liked or collected the news, the corresponding button should be shown as lit when the user opens the page again. It is therefore also necessary to query MySQL for records linking the user and the news, and to return the result to the front end for display. Here the two fields likes and collections indicate, with True and False, whether the user has previously liked or collected the article.
The returned data format is as follows:
    {
        "code": 0,
        "data": {
            "news_id": "4ded60ac-aa2f-408b-af4d-09ca0c58b50a",
            "cate": "entertainment",
            "title": "......",
            "content": "......",
            "collections": true,
            "read_num": 6,
            "likes": true,
            "ctime": "2021-11-30 10:44",
            "url": "https://news.sina.com.cn/c/2021-11-30/doc-ikyakumx1093113.shtml"
        },
        "msg": "request news_detail success."
    }
5. User behavior
In this system, users leave three main kinds of behavior when reading news:
- The first is reading, which is recorded when the user opens the details page of a news item;
- The second is liking. There is a like button at the bottom of the news details page, and clicking it triggers the system to record the behavior;
- The third is collecting (favoriting), which, like liking, must be actively triggered by the user.
Therefore, when the user opens the details page of a news item, the front end sends a request and passes JSON data to the back end:

    {
        "user_name": "wang",
        "news_id": "0a745412-db48-4e37-bf13-9a5b56028f7e",
        "action_time": 1638532127190,
        "action_type": "read"
    }

When the user clicks the like or collect button, a request is also sent with JSON data:

    // Click like
    {
        "user_name": "wang",
        "news_id": "0a745412-db48-4e37-bf13-9a5b56028f7e",
        "action_time": 1638532127190,
        "action_type": "likes:true"
    }
    // Click collect
    {
        "user_name": "wang",
        "news_id": "0a745412-db48-4e37-bf13-9a5b56028f7e",
        "action_time": 1638532127190,
        "action_type": "collections:true"
    }

The corresponding back-end interface records the user behavior from the parameters sent by the front end:
    @app.route('/recsys/action', methods=["POST"])
    def actions():
        """User behavior: reading, liking, collecting"""
        request_str = request.get_data()
        request_dict = json.loads(request_str)

        username = request_dict.get('user_name')
        newsid = request_dict.get('news_id')
        actiontype = request_dict.get("action_type")
        actiontime = request_dict.get("action_time")

        userid = UserAction().get_user_id_by_name(username)  # Get the user id
        if not userid:
            return jsonify({"code": 2000, "msg": "user not register"})

        action_type_list = actiontype.split(":")
        if len(action_type_list) == 2:
            _action_type = action_type_list[0]
            if action_type_list[1] == "false":
                # "false" means a record exists in the database and must be deleted
                if _action_type == "likes":
                    UserAction().del_likes_by_user(userid, newsid)  # Delete the user's like record
                elif _action_type == "collections":
                    UserAction().del_coll_by_user(userid, newsid)  # Delete the user's collection record
            else:
                # "true" means there is no record in the database yet and one must be added
                if _action_type == "likes":
                    userlikes = UserLikes()
                    userlikes.new(userid, username, newsid)
                    UserAction().save_one_action(userlikes)  # Save the user's like record
                elif _action_type == "collections":
                    usercollections = UserCollections()
                    usercollections.new(userid, username, newsid)
                    UserAction().save_one_action(usercollections)  # Save the user's collection record

        try:
            # Write the behavior log
            logitem = LogItem()
            logitem.new(userid, newsid, action_type_list[0])
            LogController().save_one_log(logitem)

            # Update the dynamic display data of the news in Redis
            recsys_server.update_news_dynamic_info(news_id=newsid, action_type=action_type_list)
            return jsonify({"code": 200, "msg": "action success"})
        except Exception as e:
            print(str(e))
            return jsonify({"code": 2000, "msg": "action error"})

There are three main parts in the above code:
(1) User behavior record:
The data passed from the front end contains a field "action_type":"likes:true" or "action_type":"likes:false" (the collection behavior is similar). The value of the action_type parameter is a combined string: the part before the colon is the specific behavior, and the part after the colon says whether the user is liking or cancelling a like (for example, a like made by mistake is cancelled by clicking again).
From true and false we learn not only whether the user is clicking or cancelling, but also whether a record of this user's behavior on this news already exists in the database. When false is passed, the like state is changing from true to false, so the record must already exist in the database. When true is passed, the like state is changing from false to true, so there is no record of this behavior in the database yet. This makes it easy to decide whether to insert or delete a record.
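The parsing of the combined action_type string and its effect on stored records can be sketched without a database. The set `records` stands in for the MySQL like and collection tables, and `parse_action_type` and `apply_action` are hypothetical names used only for this illustration.

```python
def parse_action_type(action_type):
    """Split 'likes:true' into ('likes', True); a bare 'read' gives ('read', None)."""
    behavior, sep, flag = action_type.partition(":")
    if not sep:
        return behavior, None
    if flag not in ("true", "false"):
        raise ValueError(f"unexpected flag: {flag!r}")
    return behavior, flag == "true"


# Hypothetical stand-in for the like/collection tables: (user_id, news_id, behavior).
records = set()


def apply_action(user_id, news_id, action_type):
    """Insert on true, delete on false, and leave the records untouched for 'read'."""
    behavior, flag = parse_action_type(action_type)
    if flag is True:
        records.add((user_id, news_id, behavior))
    elif flag is False:
        records.discard((user_id, news_id, behavior))
    return behavior, flag
```

Using `discard` rather than `remove` keeps the sketch tolerant of a "false" that arrives without a prior record, a case the article's reasoning assumes does not happen.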
(2) User behavior log:
In an enterprise, almost every system keeps logs. Their most important role is monitoring: through the logs, failures can be noticed and problems in the system can be located in time.
The logs discussed here are somewhat different. They record online user behavior. In a system like ours, analyzing such behavior helps us understand user interests and make more personalized recommendations. We therefore use logs to record meaningful user data, analyze it and build models from it.
Why we do this in our news recommendation system:
- The log gives direct access to meaningful online user data.
- Log data helps us update the dynamic features in user portraits.
- When building models later, the log lets us model quantities such as users' click-through rate and collection rate, providing the data basis for that work.
In the code above, the LogController().save_one_log() method stores the log data in MySQL.
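As a rough illustration of how such logs could feed later modeling, the sketch below computes a like rate and a collection rate relative to reads. The log schema, a list of (user_id, news_id, action) tuples, and the function `behavior_rates` are assumptions made for this example; the project's actual log table and metrics may differ.

```python
from collections import Counter


def behavior_rates(log_items):
    """Compute likes and collections per read from (user_id, news_id, action) tuples."""
    counts = Counter(action for _, _, action in log_items)
    reads = counts["read"]
    if reads == 0:
        return {"like_rate": 0.0, "collection_rate": 0.0}
    return {
        "like_rate": counts["likes"] / reads,
        "collection_rate": counts["collections"] / reads,
    }


# Hypothetical log sample: four reads, two likes and one collection.
sample_log = [
    ("u1", "n1", "read"),
    ("u1", "n1", "likes"),
    ("u2", "n1", "read"),
    ("u2", "n2", "read"),
    ("u2", "n2", "likes"),
    ("u3", "n2", "read"),
    ("u3", "n2", "collections"),
]
rates = behavior_rates(sample_log)
```

A click-through rate would additionally need exposure counts, which in this system live in the user exposure records rather than in the behavior log.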
(3) News dynamic data update
Because we display the number of reads, likes and collections of each news item, user behavior changes these three properties. We therefore need to update this dynamic data of the news in Redis.
This is done through the update_news_dynamic_info() method of the recommendation service.

    def update_news_dynamic_info(self, news_id, action_type):
        """Update the dynamic display data of a news item"""
        news_dynamic_info_str = self.dynamic_news_info_redis_db.get("dynamic_news_detail:" + news_id)
        news_dynamic_info_str = news_dynamic_info_str.replace("'", '"')  # Replace single quotes with double quotes
        news_dynamic_info_dict = json.loads(news_dynamic_info_str)

        if len(action_type) == 2:
            # action_type is [behavior, "true" or "false"], e.g. ["likes", "true"]
            if action_type[1] == "true":
                news_dynamic_info_dict[action_type[0]] += 1
            elif action_type[1] == "false":
                news_dynamic_info_dict[action_type[0]] -= 1
        else:
            # A single element means a read
            news_dynamic_info_dict["read_num"] += 1

        news_dynamic_info_str = json.dumps(news_dynamic_info_dict)
        news_dynamic_info_str = news_dynamic_info_str.replace('"', "'")
        res = self.dynamic_news_info_redis_db.set("dynamic_news_detail:" + news_id, news_dynamic_info_str)
        return res

The above code updates the dynamic features of a news item. It reads the current values from Redis, updates the relevant attribute according to the behavior sent by the front end, and writes the new result back to Redis.
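The read, modify and write cycle can be tried out with a dictionary in place of Redis. In this sketch `store` and `update_dynamic_info` are hypothetical names; unlike the project code, it stores plain JSON without swapping quote characters and clamps counters at zero, both of which are assumptions of this example rather than project behavior.

```python
import json

# Hypothetical stand-in for the Redis database holding dynamic news info.
store = {
    "dynamic_news_detail:n1": json.dumps({"likes": 0, "collections": 0, "read_num": 0}),
}


def update_dynamic_info(news_id, action_type):
    """Apply one behavior, given as ['likes', 'true'] or ['read'], to the stored counters."""
    key = "dynamic_news_detail:" + news_id
    info = json.loads(store[key])
    if len(action_type) == 2:
        field, flag = action_type
        if flag == "true":
            info[field] += 1
        elif flag == "false":
            info[field] = max(0, info[field] - 1)  # assumption: never go below zero
    else:
        info["read_num"] += 1
    store[key] = json.dumps(info)
    return info


update_dynamic_info("n1", ["read"])
update_dynamic_info("n1", ["likes", "true"])
```

Note that a read followed by a separate write is not atomic, so two concurrent updates could overwrite each other; whether that matters depends on how the real service is deployed.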
Reference
(1)datawhale notebook