Original text: A movie site with a built-in crawler, based on golang
In fact, the real motivation was that I wanted to change the display template of the original project, and to use the embed feature to rewrite how the project's static resource files are packaged.
Reviewing the code I wrote earlier was also a way to understand how embed and template are used. Thanks to the expert's blog post (full of practical content).
Readme.md
A movie site implemented with golang + redis (a basic crawler). There is no admin backend. Demo site: film.hzz.cool. Mobile access and playback are supported.
GitHub address
github.com/hezhizheng/go-movies
Features
- Static resources (html, js, css, etc.) are embedded with the embed feature of golang 1.16, so running the site depends only on the compiled executable binary and redis
- Supports starting with docker
- Supports simple resource categorization and search
- A built-in automatic crawler and a scheduled task that updates the latest resources, which basically covers everyday viewing needs
- DingTalk robot notifications
Tip
- Only the API request version is maintained for now (see API interface description.txt). Other resource sources may be added later
- When the API version starts for the first time, it requests the full data set and stores it in redis; after that, recently updated film and TV resources are crawled once every hour
Directory structure
|-- Dockerfile
|-- LICENSE.txt
|-- config
|   |-- app.go
|   `-- app.go.backup  # Program configuration file
|-- controller  # controller layer, basic page rendering
|   |-- DebugController.go
|   |-- IndexController.go
|   `-- SpiderController.go
|-- docker-compose.yml
|-- go.mod
|-- go.sum
|-- main.go
|-- models  # Define some redis query methods
|   |-- Category.go
|   |-- Movies.go
|   `-- readme.md
|-- readme.md
|-- routes
|   `-- route.go
|-- runner.conf  # fresh configuration file
|-- services  # General business processing class
|   |-- CategoryService.go
|   |-- MoviesService.go
|   `-- readme.md
|-- static2  # js, css, image and other static resource folders
|-- utils  # Some tool classes
|   |-- Cron.go
|   |-- Dingrobot.go
|   |-- Helper.go
|   |-- JsonUtil.go
|   |-- Pagination.go
|   |-- RedisUtil.go
|   |-- Spider.go
|   |-- SpiderTask.go
|   `-- spider  # Crawler api version main function code
|       |-- SpiderTaskPolicy.go
|       `-- tian_kong
|           |-- CategoriesStr.go
|           `-- SpiderApi.go
`-- views  # html template directory
    `-- tmpl
    `-- temp_global_var.go  # Define the global embed variables and the functions called by some templates
Home page preview
Installation and usage (go version >= 1.16)
# Download
git clone https://github.com/hezhizheng/go-movies
# Enter the directory
cd go-movies
# Configuration file (redis db 10 is used by default; you can modify the configuration in app.go yourself)
cp ./config/app.go.backup ./config/app.go
# Configuration description
app.spider_path: crawler route
app.spider_path_name: crawler route name
app.debug_path: debug route
app.debug_path_name: debug route name
cron.timing_spider: CRON expression for the timed crawler
ding.access_token: DingTalk robot token
app.spider_mod: fixed parameter, TianKongApi
app.spider_mod: in development mode it is recommended to set this to `true`, so that modifying static resources does not require restarting the service
# Start (the first start automatically launches the crawler task)
go run main.go
# or, with the fresh tool installed
fresh
# If installing dependencies fails, use a proxy
export GOPROXY=https://goproxy.io,direct
# or
export GOPROXY=https://goproxy.cn,direct
Then visit http://127.0.0.1:8899
Starting the crawler
- A timed crawler is built in. After the first full request, the most recently updated film and TV resources are crawled every hour (you can modify the cron.timing_spider expression in the configuration file to control the interval)
- Manual trigger: visit 127.0.0.1:8899/movies-spider directly
- Duration: the actual time depends on the response speed of the target website / interface
Tools
- Database (cache / persistence): redis, via github.com/go-redis/redis
  - Zset: each category is a sorted set
    - score: timestamp of the movie's last update
    - member: the actual URL of the movie
  - Hash: cache of per-movie details (name, cover image, etc.) and of the data for each page
- Routing: github.com/julienschmidt/httproute...
- JSON parsing (jsoniter): github.com/json-iterator/go
- Cross platform packaging: github.com/mitchellh/gox
- web server framework: github.com/valyala/fasthttp
- Configuration file reading: github.com/spf13/viper
- Hot restart: github.com/gravityblast/fresh
Compile executable (cross platform)
# Usage reference: https://github.com/mitchellh/gox
# The generated file can be executed directly
gox -osarch="windows/amd64" -ldflags "-s -w" -gcflags="all=-trimpath=${PWD}" -asmflags="all=-trimpath=${PWD}"
gox -osarch="darwin/amd64" -ldflags "-s -w" -gcflags="all=-trimpath=${PWD}" -asmflags="all=-trimpath=${PWD}"
gox -osarch="linux/amd64" -ldflags "-s -w" -gcflags="all=-trimpath=${PWD}" -asmflags="all=-trimpath=${PWD}"
- A compiled win64 executable is available for download under release
Make sure redis is running; db 10 is used by default. After a successful start the crawler runs automatically, and you can also trigger it yourself by visiting http://127.0.0.1:8899/movies-spider
Docker deployment (skip this step if you use docker compose)
# Pull the redis image (skip if you already have it)
sudo docker pull redis:latest
# Start the redis container
# Map ports as needed: -p host-port:container-port
sudo docker run -itd --name redis-test -p 6379:6379 redis
# Change the redis connection address in app.go to the container name
"addr":"redis-test"
# Compile go-movies
gox -osarch="linux/amd64" -ldflags "-s -w" -gcflags="all=-trimpath=${PWD}" -asmflags="all=-trimpath=${PWD}"
# Build the image
sudo docker build -t go-movies-docker-scratch .
# Start the container
sudo docker run --link redis-test:redis -p 8899:8899 -d go-movies-docker-scratch
One-click start with docker compose
# Change the redis connection address in app.go to the container name; it must match the one in docker-compose.yml
"addr":"redis-test"
# Compile go-movies
gox -osarch="linux/amd64" -ldflags "-s -w"
# Run
sudo docker-compose up -d
Open http://127.0.0.1:8899 in a browser to see the site