blog.mirabellette.eu

A blog about digital independence and autonomy

Advertising domain name blocking with Unbound

Written by Mirabellette / 6 comments

I realized that Shaft made his script available here ... It is more powerful but also longer than this one because it performs some verification. To be honest, I think it is also better in some ways. Feel free to combine them to make your own.

Hello everyone,

Today I want to talk about advertising on the Internet and how to block part of it with a DNS resolver like Unbound.

You are probably aware that there are thousands of ways to track users' activities on the Internet. A good protection against this is to block the resolution of the domains that try to gather information about you. It is, of course, not perfect, but it is a good first step toward reducing the tracking of your online activity.

Sometimes I read journalduhacker.net, a website that gathers "good" articles from the French open source community. I found a very interesting article by Shaft about blocking a list of domain names with Unbound. It is a very nice article that explains how to do it. It mentions a very good trick to reduce the size of the ads list and Unbound's RAM usage. Thanks to him for sharing. I only got a warning message from Unbound; I don't know why, but it works. I will investigate it later and will of course tell you how to fix it. The warning message looks like this:

[1520173472] unbound[1259:0] warning: duplicate local-zone
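
One way to track this warning down is to list the zone names that occur more than once in the generated file. The sketch below is only an illustration: the function name find_duplicate_zones is made up, and it assumes one local-zone line per domain, as the script in this article produces. Names are compared in lower case and without a trailing dot, since DNS names are case-insensitive.

```bash
#!/bin/bash
# Hypothetical helper: print zone names that appear more than once.
# $1: file containing lines such as: local-zone: "example.com" static
find_duplicate_zones() {
  # Split each line on double quotes so that field 2 is the zone name,
  # normalize it (lower case, no trailing dot), then keep repeated names.
  awk -F'"' '/^[[:space:]]*local-zone:/ {
      name = tolower($2)
      sub(/\.$/, "", name)
      print name
    }' "$1" | sort | uniq -d
}
```

For example, `find_duplicate_zones /etc/unbound/unbound.conf.d/adslist.txt` prints nothing when every zone is declared only once.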

Unfortunately, I didn't find a script that converts the ads list files directly from their source. They are commonly written like a hosts file. That's why I decided to write one myself and share it. I delete comments and other information from the original source file in a very strict way, in order to avoid any problem with Unbound. Some domain names from the source list may be dropped, but with ~97400 domain names in the result, I think the script works well enough.

Most of the ads lists in the script are from Shaft's article. I added this one too, which has a good reputation.
Thanks to Sabre's comment, I discovered that StevenBlack already provides a unified hosts list which contains the AdAway, yoyo.org and MVPS hosts lists. You can access his list here. It is the one that is now in the script.

vim /etc/unbound/unbound.conf.d/generate_domains_list_ban.sh

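#!/bin/bash

# work in the directory that unbound.conf includes the list from,
# so the script also behaves correctly when started by cron
cd /etc/unbound/unbound.conf.d || exit 1

# list of ads domain names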
array=( https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling-porn-social/hosts )

for i in "${array[@]}"
do
  wget "$i" -O w
  grep -v " #\|<td>\|<p>\|<meta>\|<link>\|<title>\|href\|title=\|=\|<" w > adsList.txt
  rm w
  dos2unix adsList.txt

  # remove hosts-file syntax and clean the file
  # (dots are escaped so they only match a literal ".")
  sed -i 's/0\.0\.0\.0//g' adsList.txt
  sed -i 's/127\.0\.0\.1//g' adsList.txt
  sed -i 's/localhost//g' adsList.txt
  sed -i 's/\.localdomain//g' adsList.txt

  # remove comments after the domain name
  sed -i 's/#.*//' adsList.txt

  # remove tab characters and carriage returns
  sed -i "s/\t//g" adsList.txt
  sed -i "s/\r//g" adsList.txt

  # remove remaining spaces
  sed -i 's/ //g' adsList.txt

  # remove empty lines
  sed -i '/^\s*$/d' adsList.txt

  # add prefix and suffix for unbound
  sed -i "s/.*/local-zone: \"&\" static/" adsList.txt

  cat adsList.txt >> adsListFinal.txt
done

# sort the list by name: sorting is cheap, and uniq only removes adjacent
# duplicate lines, so it has to run on sorted input
sort adsListFinal.txt -o adsListFinal.txt

# remove duplicate ads domains in order to avoid warnings from Unbound
uniq adsListFinal.txt > adslist.txt

# remove temporary files
rm adsListFinal.txt adsList.txt

service unbound restart
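
The chain of sed substitutions above can also be written as a single awk pass that looks only at the hostname field of each hosts entry. This is a sketch under assumptions, not a replacement for the script: it assumes entries of the form `0.0.0.0 domain` or `127.0.0.1 domain`, and the function name hosts_to_local_zones is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: read a hosts-format list on stdin and print one
# Unbound local-zone line per blocked domain on stdout, sorted and unique.
hosts_to_local_zones() {
  tr -d '\r' | awk '
    { sub(/#.*/, "") }                              # drop comments
    $1 != "0.0.0.0" && $1 != "127.0.0.1" { next }   # keep only blocking entries
    {
      name = tolower($2)
      # skip empty fields and the loopback entries at the top of hosts files
      if (name == "" || name == "localhost" || name == "localhost.localdomain" ||
          name == "local" || name == "0.0.0.0") next
      # keep only names made of letters, digits, dots, hyphens and underscores
      if (name !~ /^[a-z0-9._-]+$/) next
      print "local-zone: \"" name "\" static"
    }
  ' | sort -u
}
```

It could be used as `wget -qO- "$i" | hosts_to_local_zones > adslist.txt`. Because the names are lower-cased and the output goes through `sort -u`, lines that differ only in case or whitespace collapse into a single entry.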

You now have to tell Unbound to load the list of advertising domains. Add this line to /etc/unbound/unbound.conf, under the server: section:

# include: /YOUR_ADS_LIST_PATH
include: /etc/unbound/unbound.conf.d/adslist.txt
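
Because a malformed line in the included file can prevent Unbound from starting, it may be worth validating the configuration before restarting. The sketch below relies on `unbound-checkconf`, which ships with Unbound. The function name and the CHECK_CMD and RESTART_CMD variables are assumptions added for illustration and testing; they are not Unbound settings.

```bash
#!/bin/bash
# Hypothetical wrapper: restart Unbound only if the configuration parses.
# CHECK_CMD and RESTART_CMD are illustrative overrides (assumptions).
restart_unbound_if_valid() {
  local conf="${1:-/etc/unbound/unbound.conf}"
  if ${CHECK_CMD:-unbound-checkconf} "$conf" >/dev/null; then
    ${RESTART_CMD:-service unbound restart}
  else
    echo "unbound configuration is invalid, keeping the running instance" >&2
    return 1
  fi
}
```

Calling `restart_unbound_if_valid` in place of a bare `service unbound restart` leaves the running resolver untouched when the freshly generated list does not parse.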

At the end of the process, I got a 4.1M file with ~97400 domain names in it. Contrary to what one might expect, it isn't slow. We just have to create a cron job to make sure the list is updated regularly. I think updating it once a week is a good schedule.

# 5 2 * * Sun /YOUR_GENERATE_ADS_LIST_SCRIPT_PATH
5 2 * * Sun /etc/unbound/unbound.conf.d/generate_domains_list_ban.sh

It took me hours to write the script and this article. I hope you find them useful and interesting. Don't hesitate to comment and share.
Thank you for reading.

Social media

If you find this article interesting, feel free to subscribe to my RSS feed and to follow me on Mastodon. Don't hesitate to share it if you think it could interest someone else.

Classified in : Privacy / Tags : none

6 comments

#1  - GK said :

Perhaps the "duplicate local-zone" warning is because there are e.g. two lines that differ only in whitespace. In that case uniq would keep both lines, but unbound might still parse them as duplicates.

#2  - Mirabellette said :

You're right, that was the reason for this warning. I fixed the script and it works.

#3  - rjc said :

`sort` first, then run `uniq`

#4  - Mirabellette said :

I don't know why, but you're right! I swapped the order of sort and uniq and it fixed the warning. Thank you for your comment!

#5  - Sabre said :

You don't need to add 3 out of the 4 following HOSTS files to the script, as they are already included in StevenBlack's Hosts file (as specified here https://github.com/StevenBlack/hosts#sources-of-hosts-data-unified-in-this-variant) :

AdAway
yoyo.org
MVPS

#6  - Mirabellette said :

Thank you for your comment!

I updated the article and script. It now only downloads the most complete list from StevenBlack.
