
Building a Private DHT Search Engine (Magnet/Torrent Search)

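The steps are a bit tedious, so to give you some motivation to keep reading, here are two songs first. You can listen while you tinker so it doesn't feel so dull.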

You'll need two open-source projects:

https://github.com/AlphaReign/www-php # frontend (web service)
https://github.com/AlphaReign/scraper # crawler (backend)

You'll also need a reasonably powerful overseas VPS or dedicated server (for a VPS, at least 4 GB of RAM and a CPU that isn't too weak).
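Before going further, it may be worth checking the machine against that 4 GB requirement. The following is a minimal sketch, not part of the original setup: the helper names kb_to_mib and has_enough_ram are hypothetical, and the 3800 MiB default is an assumed threshold (a "4 GB" VPS usually reports a little less than 4096 MiB).

```shell
#!/bin/sh
# Pre-flight RAM check (a sketch; the 3800 MiB default is an assumed threshold).

# Hypothetical helper: convert a kB figure (as reported in /proc/meminfo) to whole MiB.
kb_to_mib() {
  echo $(( $1 / 1024 ))
}

# Hypothetical helper: succeed when total_kb ($1) is at least min_mib ($2, default 3800).
has_enough_ram() {
  [ "$(kb_to_mib "$1")" -ge "${2:-3800}" ]
}

# /proc/meminfo exists on Linux; skip quietly elsewhere.
if [ -r /proc/meminfo ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  if [ -n "$total_kb" ]; then
    if has_enough_ram "$total_kb"; then
      echo "RAM looks sufficient: $(kb_to_mib "$total_kb") MiB"
    else
      echo "Warning: only $(kb_to_mib "$total_kb") MiB of RAM detected"
    fi
  fi
fi
```

The check only warns and never aborts, since a machine with a bit less RAM plus swap may still manage.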

The following steps were done on Debian 10. First install the software we'll need:

apt -y update
apt -y install build-essential curl git unzip python-certbot-nginx nginx mariadb-server

The frontend requires PHP 7.0, but Debian 10's official repository only ships 7.3, so add the sury repository to install PHP 7.0:

apt -y install apt-transport-https ca-certificates lsb-release
wget -O /etc/apt/trusted.gpg.d/php.gpg https://packages.sury.org/php/apt.gpg
echo "deb https://packages.sury.org/php/ $(lsb_release -sc) main" > /etc/apt/sources.list.d/php.list
apt -y update

Install the required PHP 7.0 packages:

apt -y install php7.0-common php7.0-cli php7.0-cgi php7.0-fpm \
php7.0-mysql php7.0-sqlite3 php7.0-curl php7.0-mbstring

Install composer:

curl -sS https://getcomposer.org/installer | php
mv composer.phar /usr/bin/composer

Install nodejs/pm2/yarn:

curl -sL https://deb.nodesource.com/setup_10.x | bash -
apt -y install nodejs 
npm i -g pm2
npm i -g yarn

Install Java and Elasticsearch (es):

apt -y install gnupg default-jre
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-7.x.list
apt -y update
apt -y install elasticsearch

Initialize MySQL (MariaDB) and log in:

mysql_secure_installation
mysql -u root -p

Create a user and a database, both named dht:

CREATE DATABASE dht CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'dht'@'localhost' IDENTIFIED BY 'your-database-user-password';
GRANT ALL PRIVILEGES ON dht.* TO 'dht'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;
quit
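The same password has to be typed again later in the frontend and backend configs, so it helps to generate one up front. Here is a minimal sketch; gen_password is a hypothetical helper, and it sticks to letters and digits on the assumption that this avoids quoting trouble inside the SQL string and the PHP/JS config files.

```shell
#!/bin/sh
# Hypothetical helper: print a random alphanumeric password of the given length (default 24).
gen_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-24}"
}

pw=$(gen_password 24)
echo "Generated password: $pw"
```

Once the user exists, something like mysql -u dht -p dht -e 'SELECT 1;' should confirm that the credentials work before they are copied anywhere else.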

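Enable the services installed earlier at boot. Elasticsearch is not started automatically after installation, so start it right away as well:

systemctl enable nginx
systemctl enable mariadb
systemctl enable php7.0-fpm
systemctl enable --now elasticsearch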
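Since both the crawler and the frontend depend on Elasticsearch, it can save time to confirm that it is actually answering before continuing. The sketch below queries the cluster health endpoint; es_status is a hypothetical helper, and http://127.0.0.1:9200 is assumed to be the address (it is the default for a stock install).

```shell
#!/bin/sh
# Hypothetical helper: extract the "status" value from a cluster-health JSON response.
es_status() {
  printf '%s\n' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Assumed address: Elasticsearch's default HTTP port on localhost.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

if command -v curl >/dev/null 2>&1; then
  # An unreachable server just leaves $health empty instead of aborting.
  health=$(curl -s --max-time 5 "$ES_URL/_cluster/health" 2>/dev/null) || health=""
  status=$(es_status "$health")
  if [ -n "$status" ]; then
    echo "Elasticsearch is up, cluster status: $status"
  else
    echo "Elasticsearch is not answering on $ES_URL yet"
  fi
fi
```

Elasticsearch can take a while to come up after systemctl start, so an empty answer right after starting it does not necessarily mean failure. On a single-node setup a yellow status is common and usually fine.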

Pull the frontend source, install its dependencies, and edit the database config:

cd /opt
git clone https://github.com/AlphaReign/www-php.git
cd www-php
composer install
nano index.php

Find the following lines and set the database connection details:

define('DBNAME', 'dht');
define('DBUSER', 'dht');
define('DBPASS', 'your-database-user-password');

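Pull the backend source, install its dependencies, and edit the database config (do this before running the migration, which needs working database credentials):

cd /opt
git clone https://github.com/AlphaReign/scraper.git
cd scraper
yarn
nano config/index.js

Find the following section and set the database connection details:

                client: 'mysql',
                connection: {
                        database: 'dht',
                        host: '127.0.0.1',
                        password: 'your-database-user-password',
                        user: 'dht',

Run the database migration, then start the backend service:

yarn migrate
pm2 start ecosystem.config.js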

Create a new nginx site config file:

nano /etc/nginx/conf.d/dht.conf

Write the following config:

server {
    listen       80;
    server_name  dht.233.fi; # replace with your domain
    index        index.html index.htm index.php;
    root         /opt/www-php;
    client_max_body_size 128g;

    location / {
        try_files $uri $uri/ /index.php;
    }

    location ~ \.php$ {
        fastcgi_pass   unix:/run/php/php7.0-fpm.sock;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        include        fastcgi_params;
    }
}
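If you would rather not hand-edit the domain, the same server block can be rendered from a variable. This is only a sketch: render_dht_conf is a hypothetical helper, and dht.example.com is a placeholder domain.

```shell
#!/bin/sh
# Hypothetical helper: print the server block for a given domain ($1) and web root ($2).
render_dht_conf() {
  domain=$1
  root=${2:-/opt/www-php}
  # \$ keeps nginx variables literal; ${domain} and ${root} are filled in by the shell.
  cat <<EOF
server {
    listen       80;
    server_name  ${domain};
    index        index.html index.htm index.php;
    root         ${root};
    client_max_body_size 128g;

    location / {
        try_files \$uri \$uri/ /index.php;
    }

    location ~ \.php\$ {
        fastcgi_pass   unix:/run/php/php7.0-fpm.sock;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  \$document_root\$fastcgi_script_name;
        include        fastcgi_params;
    }
}
EOF
}

# Placeholder domain; redirect the output to /etc/nginx/conf.d/dht.conf on the server.
render_dht_conf dht.example.com
```

After writing the file, nginx -t checks the syntax before any reload. The python-certbot-nginx package installed at the start is never used in the steps here; if you want HTTPS, running certbot --nginx -d with your domain is presumably what it was meant for.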

Reload or restart nginx (either one is enough):

systemctl reload nginx
systemctl restart nginx

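Now open your site and you should see this message: Doing some maintenance

Wait a moment. Once data has been written to the database, refresh the page and you should see the login/registration page.

Register an account and log in, and you can start searching. The crawler writes to the database in real time, and the top-left corner shows how many torrents are currently in the database.

With Elasticsearch behind it, search results are accurate and searches are fast.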

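The crawler is also quite efficient, at a few hundred torrents per minute. Leave it running for a few months and you'll have a nice torrent library of your own.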
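To put "a few hundred per minute" into perspective, here is a rough back-of-the-envelope calculation. The figure of 300 per minute is an assumed stand-in, estimate_torrents is a hypothetical helper, and the result is an upper bound that assumes a constant rate with no duplicates or downtime.

```shell
#!/bin/sh
# Hypothetical helper: torrents collected at rate_per_min ($1) over a number of days ($2).
estimate_torrents() {
  echo $(( $1 * 60 * 24 * $2 ))
}

# 300/min is an assumed stand-in for "a few hundred per minute".
echo "Per day:   $(estimate_torrents 300 1)"
echo "Per month: $(estimate_torrents 300 30)"
```

At that assumed rate the ceiling is 432000 torrents a day, or roughly 13 million over 30 days. Real numbers will be lower.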

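The only downside is probably the fairly high hardware requirements: if you don't have a spare machine, you may have to spend a fair amount on a VPS.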


Comments (32)

  1. #1

    OK, OK, I'll go rent the US Summit supercomputer to run this :idea:

    橘子, 2 months ago (12-17)
  2. #2

    Does this have any advantage over the ones you posted before, like 手撕包菜 and 纸上烤鱼?

    蛤蛤, 2 months ago (12-18)
    • This crawler seems a bit more efficient.

      LALA, 2 months ago (12-20)
    • Reading that made me hungry

      橘子, 2 months ago (12-22)
  3. #3

    Looks pretty good

    zvv, 2 months ago (12-18)
  4. #4

    LALA, what did you use to make this music player? :oops:

    Caisson, 2 months ago (12-19)
    • The HermitX player plugin

      LALA, 2 months ago (12-20)
  5. #5

    Would a 4-core, 4 GB dedicated server from oneprovider be enough for this? I'd also be using it for PT downloads and for rclone syncing to gdrive and OneDrive.

    eerdc, 2 months ago (12-28)
    • Yes, that works.

      LALA, 2 months ago (12-28)
  6. #6

    Something went wrong on the backend here: yarn migrate keeps failing

    Error: ER_ACCESS_DENIED_ERROR: Access denied for user 'dht'@'localhost' (using password: YES)
    at Handshake.Sequence._packetToError (/opt/scraper/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)
    at Handshake.ErrorPacket (/opt/scraper/node_modules/mysql/lib/protocol/sequences/Handshake.js:123:18)
    at Protocol._parsePacket (/opt/scraper/node_modules/mysql/lib/protocol/Protocol.js:291:23)
    at Parser._parsePacket (/opt/scraper/node_modules/mysql/lib/protocol/Parser.js:433:10)

    小东, 2 months ago (12-31)
    • The steps were written in the wrong order. Run nano config/index.js and fill in the database connection details first, then run yarn migrate.

      LALA, 2 months ago (12-31)
      • Done. My 8-core, 4 GB server from cac barely managed to run it: http://dht.v2ex.ltd/
        This server only has an 80 GB disk; after 4 GB of swap and various other bits and pieces, only 60 GB is left.
        How many torrents can it still crawl?

        小东, 2 months ago (12-31)
        • A 60 GB disk can hold a lot of crawled torrents.

          LALA, 2 months ago (01-02)
  7. #7

    LALA, why has mine been stuck in the Doing some maintenance state? It's been a whole day :grin:

    啥啥, 1 month ago (01-07)
    • Hey, did you get your problem solved? :razz:

      loeveo, 1 month ago (01-13)
      • Check whether es failed to start: systemctl status elasticsearch
        If it isn't running, start it yourself: systemctl start elasticsearch

        LALA, 1 month ago (01-16)
        • So that's it. Installing this thing on centos is a real pain, though :razz: :razz:

          loeveo, 1 month ago (01-17)
        • I installed es on centos 7 following this tutorial, https://www.jianshu.com/p/e49ed6acd7da, but it's still stuck at Doing some maintenance

          loeveo, 1 month ago (01-17)
          • Install it by following the official docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html
            Or, if you really can't manage it, try this; using docker makes it much simpler: https://lala.im/6624.html

            LALA, 4 weeks ago (01-21)
  8. #8

    I installed it successfully, but refreshing the frontend doesn't do anything. Checking with htop, I don't seem to see the pm2 process :cry: running, even though it printed a successful-start message

    loeveo, 1 month ago (01-11)
  9. #9

    Mine is centos 7. I took a look, and it seems something went wrong while installing one of the dependencies. The error looks like this:
    [root@PAR-162254 scraper]# yarn migrate
    yarn run v1.21.1
    $ ./node_modules/.bin/knex migrate:latest
    Error: ER_ACCESS_DENIED_ERROR: Access denied for user 'root'@'localhost' (using password: YES)
    at Handshake.Sequence._packetToError (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/sequences/Sequence.js:47:14)
    at Handshake.ErrorPacket (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/sequences/Handshake.js:124:18)
    at Protocol._parsePacket (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:278:23)
    at Parser.write (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Parser.js:76:12)
    at Protocol.write (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:38:16)
    at Socket.<anonymous> (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:91:28)
    at Socket.<anonymous> (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:502:10)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:176:18)
    at Socket.Readable.push (_stream_readable.js:134:10)
    at TCP.onread (net.js:559:20)
    --------------------
    at Protocol._enqueue (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:144:48)
    at Protocol.handshake (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/protocol/Protocol.js:51:23)
    at Connection.connect (/www/wwwroot/163.172.63.104/scraper/node_modules/mysql/lib/Connection.js:118:18)
    at /www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/dialects/mysql/index.js:95:18
    at Promise._execute (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/debuggability.js:303:9)
    at Promise._resolveFromExecutor (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/promise.js:483:18)
    at new Promise (/www/wwwroot/163.172.63.104/scraper/node_modules/bluebird/js/release/promise.js:79:10)
    at Client_MySQL.acquireRawConnection (/www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/dialects/mysql/index.js:90:12)
    at create (/www/wwwroot/163.172.63.104/scraper/node_modules/knex/lib/client.js:280:23)
    at tryPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:366:22)
    at tryPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/utils.js:57:20)
    at Promise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:366:5)
    at callbackOrPromise (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:357:10)
    at Pool._create (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:307:5)
    at Pool._doCreate (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:275:32)
    at Pool._tryAcquireOrCreate (/www/wwwroot/163.172.63.104/scraper/node_modules/tarn/lib/Pool.js:212:12)
    error Command failed with exit code 1.
    info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

    loeveo, 1 month ago (01-11)
  10. #10

    I've got both the frontend and the backend set up, and I can now see data in the database, but refreshing the frontend still doesn't bring up that page. What's going on? :eek:

    loeveo, 1 month ago (01-11)
  11. #11

    After a successful install I opened the frontend and registered successfully, but after entering my details to log in I got:
    Oops!
    Looks like we ran into a little issue. Please try again later.
    What causes this?

    yueed, 3 weeks ago (02-02)
    • I haven't run into this. In principle, since registration works, logging in should work normally too.

      LALA, 2 weeks ago (02-03)
  12. #12

    After registering an account, logging in shows: "Oops! Looks like we ran into a little issue. Please try again later."
    After setting displayErrorDetails to true, the error is as follows:

    Slim Application Error
    The application could not run because of the following error:

    Details
    Type: Elasticsearch\Common\Exceptions\BadRequest400Exception
    Code: 400
    Message: {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [seeders] in order to sort on","index_uuid":"nwVbIYF2QR6t6S5EH9Mb9w","index":"torrents"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"torrents","node":"yWoCXa21QYy3V6LhowHbTg","reason":{"type":"query_shard_exception","reason":"No mapping found for [seeders] in order to sort on","index_uuid":"nwVbIYF2QR6t6S5EH9Mb9w","index":"torrents"}}]},"status":400}
    File: /opt/www-php/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php
    Line: 630

    It looks like an elasticsearch problem, but I don't know how to fix it. Could you take a look when you have time?

    locd, 2 weeks ago (02-06)
    • It looks like the elasticsearch service didn't come up. How much RAM does your machine have? elasticsearch needs at least 2 GB of RAM to run reasonably smoothly.

      LALA, 2 weeks ago (02-08)
      • 4 GB of RAM, and systemctl status elasticsearch shows it as started. Does that count as running?

        I also looked in the database, and the seeders value is NULL

        locd, 2 weeks ago (02-08)
        • Run netstat -nltpu and see whether es ports such as 9200/9300 are listening. If they are, it started successfully.

          LALA, 1 week ago (02-10)
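The error text in this thread says the sort failed because the torrents index has no mapping for seeders, which can happen even while the service itself is running. One way to look at that directly is to ask Elasticsearch for the index mapping. This is only a sketch: mapping_has_field is a hypothetical helper, the address is assumed to be the default http://127.0.0.1:9200, and the index name torrents is taken from the error message.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a mapping JSON ($1) mentions the given field name ($2).
mapping_has_field() {
  printf '%s' "$1" | grep -q "\"$2\"[[:space:]]*:"
}

# Assumed address: Elasticsearch's default HTTP port on localhost.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

if command -v curl >/dev/null 2>&1; then
  # An unreachable server just leaves $mapping empty instead of aborting.
  mapping=$(curl -s --max-time 5 "$ES_URL/torrents/_mapping" 2>/dev/null) || mapping=""
  if mapping_has_field "$mapping" seeders; then
    echo "The torrents index has a mapping for seeders"
  else
    echo "No seeders mapping found (or Elasticsearch is unreachable)"
  fi
fi
```

If the field is missing, that would be consistent with the seeders column still being NULL in the database, though this check alone does not show why the crawler has not filled it in.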
  13. #13

    Why did it pause after crawling 10229?

    mutiko, 3 days ago
    • Did es or mysql die? This program is fairly demanding on hardware.

      LALA, 3 days ago
      • es is fine. If sql had gone down, queries wouldn't work at all, right?

        mutiko, 20 hours ago
      • I found that es and sql are both running

        mutiko, 19 hours ago
