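The steps are a bit tedious, so to keep you motivated, here are two songs to start with; you can listen while you tinker so it doesn't get boring.
Two open-source projects are needed:
https://github.com/AlphaReign/www-php # front end (web service)
https://github.com/AlphaReign/scraper # crawler (back end)
You also need a fairly well-specced overseas VPS or dedicated server (a VPS needs at least 4GB of RAM, and the CPU can't be too weak).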
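To check the 4GB requirement before installing anything, a small sketch along these lines can help. The function names and the optional file argument are illustrative assumptions, not part of either project. It simply reads MemTotal from /proc/meminfo.

```shell
#!/usr/bin/env bash
# Print total RAM in MiB, read from a meminfo-style file (defaults to /proc/meminfo).
mem_total_mib() {
    local file="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "$file"
}

# Return 0 if the machine has at least the required amount of RAM (in MiB).
has_enough_ram() {
    local required="$1" file="${2:-/proc/meminfo}"
    local total
    total="$(mem_total_mib "$file")"
    [ -n "$total" ] && [ "$total" -ge "$required" ]
}
```

A VPS sold as "4GB" usually reports a little less than 4096 MiB, so a looser threshold such as `has_enough_ram 3500 && echo "RAM looks fine"` is probably more practical than an exact one.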
The following steps were done on Debian 10. First install the software we need:
    apt -y update
    apt -y install build-essential curl git unzip python-certbot-nginx nginx mariadb-server
The front end needs PHP 7.0, but Debian 10's official repositories only ship 7.3, so add the sury repository to install PHP 7.0:
    apt -y install apt-transport-https ca-certificates lsb-release
    wget -O /etc/apt/trusted.gpg.d/php.gpg https://packages.sury.org/php/apt.gpg
    echo "deb https://packages.sury.org/php/ $(lsb_release -sc) main" > /etc/apt/sources.list.d/php.list
    apt -y update
Install the PHP 7.0 packages we need:
    apt -y install php7.0-common php7.0-cli php7.0-cgi php7.0-fpm \
        php7.0-mysql php7.0-sqlite3 php7.0-curl php7.0-mbstring
Install composer:
    curl -sS https://getcomposer.org/installer | php
    mv composer.phar /usr/bin/composer
Install nodejs/pm2/yarn:
    curl -sL https://deb.nodesource.com/setup_10.x | bash -
    apt -y install nodejs
    npm i -g pm2
    npm i -g yarn
Install java/es:
    apt -y install gnupg default-jre
    wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
    echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-7.x.list
    apt -y update
    apt -y install elasticsearch
Initialize MySQL, then log in to it:
    mysql_secure_installation
    mysql -u root -p
Create a user and a database, both named dht:
    CREATE DATABASE dht CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    CREATE USER 'dht'@'localhost' IDENTIFIED BY 'your-database-user-password';
    GRANT ALL PRIVILEGES ON dht.* TO 'dht'@'localhost' WITH GRANT OPTION;
    FLUSH PRIVILEGES;
    quit
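Since a wrong password only shows up later as ER_ACCESS_DENIED_ERROR, it may be worth confirming right away that the new account can log in. The sketch below is one way to do that. `can_login` and the `MYSQL_BIN` override are hypothetical names introduced here for illustration. It connects to 127.0.0.1, the host the crawler's config uses.

```shell
#!/usr/bin/env bash
# Return 0 if the given user can log in and open the given database.
# MYSQL_BIN is an illustrative override so a different client binary can be used.
# Note: passing the password via -p on the command line exposes it to other
# local users in the process list; acceptable for a one-off check only.
can_login() {
    local user="$1" pass="$2" db="$3"
    "${MYSQL_BIN:-mysql}" -h 127.0.0.1 -u "$user" -p"$pass" -e 'SELECT 1;' "$db" >/dev/null 2>&1
}
```

For example: `can_login dht 'your-database-user-password' dht && echo "credentials OK"`. If this fails, the same credentials will presumably fail in the front-end and back-end configs as well.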
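Enable the services installed earlier so they start at boot:
    systemctl enable nginx
    systemctl enable mariadb
    systemctl enable php7.0-fpm
    systemctl enable elasticsearch
Pull the front-end source, install its dependencies, and edit the database settings:
    cd /opt
    git clone https://github.com/AlphaReign/www-php.git
    cd www-php
    composer install
    nano index.php
Find the following lines and update the database connection settings:
    define('DBNAME', 'dht');
    define('DBUSER', 'dht');
    define('DBPASS', 'your-database-user-password');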
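Pull the back-end source, install its dependencies, and edit the database settings:
    cd /opt
    git clone https://github.com/AlphaReign/scraper.git
    cd scraper
    yarn
    nano config/index.js
Find the following section and update the database connection settings:
    client: 'mysql',
    connection: {
        database: 'dht',
        host: '127.0.0.1',
        password: 'your-database-user-password',
        user: 'dht',
Once the connection settings are saved, run the migrations (running yarn migrate before editing the config fails with ER_ACCESS_DENIED_ERROR), then start the back-end service:
    yarn migrate
    pm2 start ecosystem.config.js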
Create a new nginx site configuration file:
    nano /etc/nginx/conf.d/dht.conf
Write the following configuration into it:
    server {
        listen 80;
        server_name dht.233.fi; # replace with your domain
        index index.html index.htm index.php;
        root /opt/www-php;
        client_max_body_size 128g;

        location / {
            try_files $uri $uri/ /index.php;
        }

        location ~ \.php$ {
            fastcgi_pass unix:/run/php/php7.0-fpm.sock;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
    }
Reload or restart nginx (either one is enough):
    systemctl reload nginx
    systemctl restart nginx
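A typo in the site file can stop nginx from coming back up, so it can be safer to run nginx's built-in syntax test (`nginx -t`) first and reload only if it passes. A minimal sketch follows. The function name and the `NGINX_BIN` / `SYSTEMCTL_BIN` overrides are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Reload nginx only if the configuration passes its built-in syntax test.
# NGINX_BIN and SYSTEMCTL_BIN are illustrative overrides, mainly useful for testing.
safe_reload_nginx() {
    if "${NGINX_BIN:-nginx}" -t; then
        "${SYSTEMCTL_BIN:-systemctl}" reload nginx
    else
        echo "nginx config test failed; not reloading" >&2
        return 1
    fi
}
```

Calling `safe_reload_nginx` prints nginx's own error (file and line number) when the test fails and leaves the running server untouched.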
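If you open your site now, you should see this message: Doing some maintenance
Wait a moment. Once some data has been stored, refresh the page and you should see the login/registration page.
Register an account and you can start searching. The crawler stores results in real time, and the top-left corner shows how many torrents are currently in the database.
With es behind it, search results are accurate and searches are fast.
The crawler is also quite efficient, at a few hundred torrents per minute. Leave it running for a few months and you'll have your own torrent library, sweet~
The only downside is probably the hardware demands: without an idle machine, you may have to spend a fair amount on a VPS.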
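Okay, okay, I'll go rent the US Summit supercomputer and give it a try.
Does this have any advantage over the ones you posted before, like 手撕包菜 and 纸上烤鱼?
This crawler seems a bit more efficient.
Reading this made me hungry.
Looks pretty good.
lala, what is the music player made with?
The HermitX player plugin.
Would a oneprovider 4-core 4G dedicated server handle this? I'd also use it for PT downloads and rclone syncing to gdrive and OneDrive.
Yes, it would.
The back end failed here; yarn migrate keeps erroring:
Error: ER_ACCESS_DENIED_ERROR: Access denied for user 'dht'@'localhost' (using password: YES)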
at Handshake.Sequence._packetToError (/opt/scraper/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)
at Handshake.ErrorPacket (/opt/scraper/node_modules/mysql/lib/protocol/sequences/Handshake.js:123:18)
at Protocol._parsePacket (/opt/scraper/node_modules/mysql/lib/protocol/Protocol.js:291:23)
at Parser._parsePacket (/opt/scraper/node_modules/mysql/lib/protocol/Parser.js:433:10)
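The steps were written in the wrong order: edit config/index.js with the database connection details first, then run yarn migrate.
Done. My cac 8-core 4G server barely managed to run it, http://dht.v2ex.ltd/
This server only has an 80G disk; after the 4G swap and other odds and ends, only 60G is left.
How many torrents can that hold?
A 60G disk can hold a lot.
lala, why is mine stuck at Doing some maintenance? It's been a day.
Hey, did you solve your problem?
Check whether es failed to start: systemctl status elasticsearch
If it isn't running, start it yourself: systemctl start elasticsearch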
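systemctl only reports whether the service unit is running; asking the node itself is a slightly stronger check. The sketch below queries Elasticsearch's `_cluster/health` endpoint on the default port 9200 and extracts the status field. The function names are illustrative, and the sed-based parsing is a deliberately simple assumption rather than a full JSON parser.

```shell
#!/usr/bin/env bash
# Extract the "status" field (green/yellow/red) from _cluster/health JSON on stdin.
es_status_from_json() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Query a node (default: local, port 9200) and print its cluster status.
# Returns non-zero without printing anything if the node cannot be reached.
es_status() {
    local url="${1:-http://127.0.0.1:9200}"
    local body
    body="$(curl -fsS --max-time 5 "$url/_cluster/health")" || return 1
    printf '%s\n' "$body" | es_status_from_json
}
```

Running `es_status` on a single-node install will often print yellow, which is normal when indices have replicas configured; a failure to connect suggests es has not finished starting or has crashed.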
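I see. Installing this thing on centos is a damn pain, though.
I installed es on CentOS 7 following this tutorial https://www.jianshu.com/p/e49ed6acd7da, but it still stays on Doing some maintenance.
Install it by following the official docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html
Or, if you really can't manage it, try this; using docker is much simpler: https://lala.im/6624.html
I installed it successfully, but refreshing the front end does nothing. Checking with htop, I don't seem to see any pm2 processes running, yet it printed a successful start message.
Mine is centos 7. I looked into it, and it seems something failed while installing a dependency. The error is as follows: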
[root@PAR-162254 scraper]# yarn migrate
yarn run v1.21.1
$ ./node_modules/.bin/knex migrate:latest
Error: ER_ACCESS_DENIED_ERROR: Access denied for user 'root'@'localhost' (using password: YES)
at Handshake.Sequence._packetToError (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)
at Handshake.ErrorPacket (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/sequences/Handshake.js:124:18)
at Protocol._parsePacket (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:278:23)
at Parser.write (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Parser.js:76:12)
at Protocol.write (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:38:16)
at Socket.<anonymous> (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:91:28)
at Socket.<anonymous> (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:502:10)
at emitOne (events.js:96:13)
at Socket.emit (events.js:188:7)
at readableAddChunk (_stream_readable.js:176:18)
at Socket.Readable.push (_stream_readable.js:134:10)
at TCP.onread (net.js:559:20)
--------------------
at Protocol._enqueue (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:144:48)
at Protocol.handshake (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:51:23)
at Connection.connect (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:118:18)
at /www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/dialects/mysql/index.js:95:18
at Promise._execute (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/debuggability.js:303:9)
at Promise._resolveFromExecutor (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/promise.js:483:18)
at new Promise (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/promise.js:79:10)
at Client_MySQL.acquireRawConnection (/www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/dialects/mysql/index.js:90:12)
at create (/www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/client.js:280:23)
at tryPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:366:22)
at tryPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/utils.js:57:20)
at Promise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:366:5)
at callbackOrPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:357:10)
at Pool._create (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:307:5)
at Pool._doCreate (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:275:32)
at Pool._tryAcquireOrCreate (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:212:12)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
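I got both the front end and the back end working, and I can now see data in the database, but refreshing the front end never gets to that page. What's going on?
After a successful install I opened the front-end page and registered. When I enter my details and log in, it shows: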
Oops!
Looks like we ran into a little issue. Please try again later.
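What causes this?
Never ran into that. Logically, if registration works, logging in should work too.
After registering an account, logging in shows: "Oops! Looks like we ran into a little issue. Please try again later."
After setting displayErrorDetails to true, the error shown is: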
Slim Application Error
The application could not run because of the following error:
Details
Type: Elasticsearch\Common\Exceptions\BadRequest400Exception
Code: 400
Message: {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [seeders] in order to sort on","index_uuid":"nwVbIYF2QR6t6S5EH9Mb9w","index":"torrents"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"torrents","node":"yWoCXa21QYy3V6LhowHbTg","reason":{"type":"query_shard_exception","reason":"No mapping found for [seeders] in order to sort on","index_uuid":"nwVbIYF2QR6t6S5EH9Mb9w","index":"torrents"}}]},"status":400}
File: /opt/www-php/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php
Line: 630
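It seems to be an elasticsearch problem, but I don't know how to fix it. Could you take a look when you have time?
Looks like the elasticsearch service didn't come up. How much RAM does your machine have? elasticsearch needs at least 2G of memory to run reasonably smoothly.
4G of RAM, and systemctl status elasticsearch shows it as started. Does that count as running?
Also, I looked in the database and the seeders values are NULL.
Run netstat -nltpu and see whether the es ports are listening, 9200/9300 and so on. If they are, it started successfully.
Why did it stall after crawling 10229?
Did es or mysql die? This program is demanding on hardware.
es is fine. If sql had died, queries shouldn't work at all, right?
I found that es and sql are both running.
One question: after installation everything else works, I can log in and search, but the very bottom of the page shows: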
Warning: number_format() expects parameter 1 to be float, array given in /opt/www-php/handlers/search/query.php on line 125
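And the total torrent count at the top left of the page never changes, although the database shows torrents are being stored.
My server is centos7, with php installed through the 宝塔 (BT) panel.
My guess is a php version issue. Are you using 7.0?
It's 7.0, installed directly from the 宝塔 panel. Not sure whether that's the cause.
Did you solve it?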
emmm
[PM2] Applying action restartProcessId on app [scraper](ids: 0)
[PM2] Applying action restartProcessId on app [loader](ids: 1)
[PM2] [scraper](0) ✓
[PM2] [loader](1) ✓
┌────┬────────────────────┬──────────┬──────┬───────────┬──────────┬──────────┐
│ id │ name │ mode │ ↺ │ status │ cpu │ memory │
├────┼────────────────────┼──────────┼──────┼───────────┼──────────┼──────────┤
│ 1 │ loader │ fork │ 1 │ online │ 0% │ 8.5mb │
│ 0 │ scraper │ fork │ 1 │ online │ 0% │ 25.5mb │
└────┴────────────────────┴──────────┴──────┴───────────┴──────────┴──────────┘
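lala, does this count as the back end running? The front end opens too, but it keeps showing Doing some maintenance.
emmm, already solved. lala, could you post the NGINX rewrite rules? The ones from the config file give an error when saved in 宝塔.
In 宝塔 the rewrite rules should only need this block: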
location / {
try_files $uri $uri/ /index.php;
}