Preface

Anyone who regularly uses Selenium or Puppeteer knows that the Chrome browser they launch can run in either headed or headless mode. On your own computer, headed mode opens a visible Chrome window in which you can watch the automated actions, while headless mode opens no window and runs only as a background process. Headed mode is also harder to detect: even without any feature-hiding techniques, simply running headed makes a crawler much safer. On websites whose anti-crawler measures are not very strict, a headless browser is often detected where a headed one is not.

The picture below shows headed mode visiting a detection website without any feature-hiding techniques:
image.png

The picture below shows headless mode visiting the same detection website, again without any feature-hiding techniques:
image.png

When we run Selenium or Puppeteer crawlers on a Linux server, we find that headed mode always reports an error. Headed mode needs the system to provide a graphical environment in which to draw the browser window, but Linux servers generally do not have one, so the browser cannot start.
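A quick way to confirm this situation is to check whether the process has an X display to connect to. The sketch below assumes an X11 setup, where the display is named by the DISPLAY environment variable; the helper name has_x_display is hypothetical and only illustrates the check.

```python
import os


def has_x_display(environ=None):
    """Return True if the environment names an X display to connect to."""
    env = os.environ if environ is None else environ
    # An unset or empty DISPLAY means X clients have no server to talk to.
    return bool(env.get("DISPLAY"))


if __name__ == "__main__":
    if has_x_display():
        print("DISPLAY is set to", os.environ["DISPLAY"])
    else:
        print("No DISPLAY set; a headed browser has nowhere to draw its window")
```

On a typical server without a desktop environment, DISPLAY is unset, which is consistent with the failure described above.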

To use the headed mode of an automated browser in this situation, we need to create a fake graphical environment that deceives the browser into starting normally.

For this purpose we can use a tool called Xvfb. Wikipedia [1] introduces it as follows:
Xvfb or X virtual framebuffer is a display server implementing the X11 display server protocol. In contrast to other display servers, Xvfb performs all graphical operations in virtual memory without showing any screen output.

In other words, Xvfb implements the X11 display server protocol on a machine without a graphics device. It provides the interfaces that a real display server has, but nothing is ever drawn on a physical screen. When a program performs GUI operations against Xvfb, those operations run in virtual memory and you cannot see anything.

Using Xvfb, we can trick Selenium or Puppeteer into thinking that it is running in a system with a graphical interface, so that we can use the headed mode normally.
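To make the idea concrete, here is a minimal sketch of starting Xvfb by hand and pointing X clients at it. It assumes the Xvfb binary is already installed and on the PATH; the display number 99, the screen geometry, and the helper names are arbitrary choices for illustration.

```python
import os
import subprocess


def build_xvfb_command(display_number=99, width=1920, height=1080, depth=24):
    """Return the argv list that starts Xvfb on the given display number."""
    return [
        "Xvfb",
        f":{display_number}",              # display name, for example :99
        "-screen", "0",                    # configure screen number 0
        f"{width}x{height}x{depth}",       # WIDTHxHEIGHTxDEPTH
    ]


def start_virtual_display(display_number=99):
    """Start Xvfb in the background and export DISPLAY for child processes."""
    process = subprocess.Popen(build_xvfb_command(display_number))
    # Programs started after this point will connect to the virtual display.
    # Note: this sketch does not wait for the server to finish starting.
    os.environ["DISPLAY"] = f":{display_number}"
    return process
```

A browser launched after start_virtual_display() returns would draw into the virtual framebuffer instead of a real screen. Call terminate() on the returned process when finished.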

Installing Xvfb is very simple. On Ubuntu, just run the following two commands:

sudo apt-get update
sudo apt-get install xvfb

Now let's write a very simple Selenium script that drives Chrome:

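import time
from selenium.webdriver import Chrome

# Selenium 3 style: the chromedriver path is passed positionally.
# With Selenium 4, pass service=Service('./chromedriver') instead.
driver = Chrome('./chromedriver')
driver.get('https://bot.sannysoft.com/')  # bot-detection test page
time.sleep(5)  # give the detection scripts time to finish
driver.save_screenshot('screenshot.png')
driver.quit()  # close the browser and stop the chromedriver process
print('run completed')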

If we run this directly on the server, the result is as shown in the figure below:
image.png

Because there is no graphical interface, the program fails with an error.
Now we only need to put xvfb-run in front of the command that runs this code (for example, xvfb-run python3 test.py) and look at the result again:
image.png
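If the crawler is launched from another Python program rather than from a shell, the same xvfb-run prefix can be applied with subprocess. This is only a sketch: the helper names are hypothetical, and the optional server_args string is forwarded through xvfb-run's -s option, which hands extra arguments to the Xvfb server.

```python
import shutil
import subprocess


def build_xvfb_run_command(script, server_args=None):
    """Return the argv list for running a Python script under xvfb-run."""
    command = ["xvfb-run"]
    if server_args:
        # xvfb-run options must come before the command being wrapped.
        command += ["-s", server_args]
    command += ["python3", script]
    return command


def run_under_xvfb(script, server_args=None):
    """Run the script under xvfb-run, raising if it exits with an error."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found; install the xvfb package first")
    return subprocess.run(build_xvfb_run_command(script, server_args), check=True)
```

For example, run_under_xvfb("test.py") is equivalent to typing xvfb-run python3 test.py.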

The code runs successfully with no errors. Now we download the generated screenshot.png file from the server; opening it shows the following:
image.png

As you can see, although the window is relatively small, this is indeed the detection result for headed mode. We can also adjust the size of the virtual screen by passing extra parameters. Note that xvfb-run's options must come before the command being run: xvfb-run -s "-screen 0 1920x1080x16" python3 test.py pretends to run the program on a display with a resolution of 1920x1080 and a 16-bit color depth. Then modify the Selenium code to set the size of the browser window:
image.png
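The modified code appears only as an image above, so the following is a minimal sketch of what such a change might look like rather than the original code. It assumes the browser window should match the virtual screen, and that Selenium can locate chromedriver without an explicit path; parse_screen_spec and open_sized_browser are hypothetical helper names.

```python
def parse_screen_spec(spec):
    """Split an Xvfb screen spec such as '1920x1080x16' into integers."""
    width, height, depth = (int(part) for part in spec.split("x"))
    return width, height, depth


def open_sized_browser(screen_spec="1920x1080x16"):
    """Start Chrome and resize its window to fill the virtual screen."""
    # Imported here so the parsing helper works without Selenium installed.
    from selenium.webdriver import Chrome

    width, height, _depth = parse_screen_spec(screen_spec)
    driver = Chrome()  # assumption: chromedriver is discoverable by Selenium
    driver.set_window_size(width, height)
    return driver
```

Keeping the window size and the -screen argument derived from the same string avoids a mismatch between the two.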

The result is shown in the figure below:
image.png

This article demonstrates the approach with Selenium driven from Python. You can also try it with Puppeteer: just change the startup command to xvfb-run node index.js.
