Personalized recommendation without login? A detailed explanation of browser fingerprints

Biometric technology is now standard on most smartphones: nearly every phone offers face recognition, fingerprint recognition or similar functions, and fingerprint recognition in particular is very mature. Today, however, we are not talking about fingerprints in the biometric sense but about browser fingerprints, a technology that many people both love and hate. Why is that? Let's take a closer look at browser fingerprints.

What is a browser fingerprint

A browser fingerprint identifies and tracks a web browser through the configuration and settings that are visible to a website. Like the fingerprints on our hands, it distinguishes one individual from another, except that, at this stage, what it identifies is the browser rather than the person.

The information that goes into a browser fingerprint can include the User-Agent (UA), time zone, geographical location, language and so on. The more information the browser exposes, the more accurate the fingerprint.

For a website, the browser fingerprint by itself has little practical value; what is really valuable is the user information associated with it. For a webmaster, collecting browser fingerprints and recording what users do is worthwhile, especially in scenarios where the user has no identity, that is, is not logged in.

For example, suppose user A, who has not registered on a video website, likes to watch anime videos. The site records this preference against the browser fingerprint, so on the next visit it can push anime videos to that browser directly. Because most Internet devices today are personal, this kind of recommendation tends to be well received and helps turn visitors into registered users of the site.

Development of browser fingerprint

Like most technologies, the development of browser fingerprint technology is not achieved overnight. The existing generations of browser fingerprint technology are as follows:

  • The first generation is stateful and relies mainly on cookies and evercookies stored on the user's machine. Users need to log in before useful information can be obtained.

  • The second generation introduces the concept of the browser fingerprint: users are made more distinguishable by collecting more and more characteristic values of the browser, such as the UA and browser plug-in information.

  • The third generation focuses on the person. By collecting users' behaviors and habits and building feature values, or even models, for each user, it aims at true tracking of people. The implementation is complex, however, and is still being explored.

At present, browser fingerprint tracking can be considered generation 2.5, because the problem of cross-browser fingerprint identification is still not solved.

Fingerprint acquisition

Information entropy is the average amount of information contained in each message received. The higher the information entropy, the more information can be transmitted. The lower the information entropy, the less information can be transmitted.
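
As a rough illustration of the definition above, the sketch below computes the Shannon entropy (in bits) of a single attribute from how often each of its values is observed. The function name shannonEntropy and the sample counts are invented for this example; they are not measurements.

```javascript
// Shannon entropy, in bits, of a distribution given as { value: count }.
// The counts used below are invented sample data, not real measurements.
function shannonEntropy(counts) {
  const values = Object.values(counts);
  const total = values.reduce((sum, c) => sum + c, 0);
  let entropy = 0;
  for (const c of values) {
    if (c === 0) continue; // a value that never occurs contributes nothing
    const p = c / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Four equally common time zones carry 2 bits of information.
const timezoneCounts = { 'UTC+8': 25, 'UTC+1': 25, 'UTC-5': 25, 'UTC+0': 25 };
console.log(shannonEntropy(timezoneCounts)); // 2

// If every visitor reports the same value, the attribute distinguishes nobody.
console.log(shannonEntropy({ 'UTC+8': 100 })); // 0
```

An attribute whose values are spread evenly over many possibilities scores high and is useful for telling browsers apart; one that is the same for almost everybody scores close to zero.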

A browser fingerprint is synthesized from many pieces of characteristic information about the browser, and these characteristic values carry different amounts of information entropy. Fingerprints are therefore divided into basic fingerprints and advanced fingerprints.

Basic fingerprint

The basic fingerprint is the part that is easy to discover and easy to modify, such as the HTTP request headers.

{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Host": "httpbin.org",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
  }
}

In addition to the HTTP headers, characteristic information about the browser can also be obtained by other means, such as:

  • The User-Agent (UA) of the browser

  • The HTTP Accept headers sent by the browser

  • Browser extensions / plug-ins installed in the browser, such as QuickTime, Flash, Java or Acrobat, and the versions of these plug-ins

  • Fonts installed on the computer

  • Whether the browser executes JavaScript

  • Whether the browser accepts various cookies and "super cookies"

  • Whether the browser is set to "Do Not Track"

  • System platform (e.g. Win32, Linux x86)

  • System language (e.g. zh-CN, en-US)

  • Whether the browser supports a touch screen

Once these values have been collected, they can be processed, for example combined and hashed, to estimate the information entropy of the browser fingerprint and to derive a UUID-like identifier for the browser.
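
One possible way to turn the collected values into an identifier is sketched below. It is only an illustration: the function names collectBasicAttributes, fnv1a32 and basicFingerprintId are made up for this example, and the choice of an FNV-1a style hash is an assumption, not something the fingerprinting libraries are known to use.

```javascript
// Collect a few of the basic attributes listed above. `nav` and `scr` are
// expected to look like window.navigator and window.screen; they are passed in
// so that the function can also be exercised outside a browser.
function collectBasicAttributes(nav, scr) {
  return {
    userAgent: nav.userAgent,
    language: nav.language,
    platform: nav.platform,
    cookieEnabled: nav.cookieEnabled,
    doNotTrack: nav.doNotTrack,
    touchSupport: (nav.maxTouchPoints || 0) > 0,
    screen: [scr.width, scr.height, scr.colorDepth].join('x'),
    timezoneOffset: new Date().getTimezoneOffset(),
  };
}

// 32-bit FNV-1a style hash over the UTF-16 code units of a string.
// It is not cryptographic; it only squeezes the attributes into a short id.
function fnv1a32(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

function basicFingerprintId(attributes) {
  // Sort the keys so that the same attributes always serialize the same way.
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => key + '=' + String(attributes[key]))
    .join(';');
  return fnv1a32(canonical);
}

// In a browser: basicFingerprintId(collectBasicAttributes(navigator, screen))
```

Two browsers that report identical values for every attribute end up with the same identifier, which is exactly the weakness of the basic fingerprint.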

This information is comparable to a person's weight, height and skin color: many individuals share the same values, so it can only serve as auxiliary identification. To judge uniqueness, we need more precise fingerprints.

Advanced fingerprint

Basic fingerprints are not enough to single out a unique individual. Advanced fingerprints are needed to narrow the range further and even to generate a unique cross-browser identity.

The pieces of information used to build a fingerprint can be weighted: a feature with higher information entropy receives a larger weight.
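
One way to read the weighting idea is that, when two fingerprints are compared, agreement on a high-entropy feature should count for more than agreement on a low-entropy one. The sketch below is only an illustration of that reading; weightedSimilarity, the feature names and the weights are invented here and are not taken from any paper or library.

```javascript
// Weighted agreement between two fingerprints, from 0 (nothing matches) to 1.
// The weights stand in for per-feature entropy; the numbers are placeholders.
function weightedSimilarity(a, b, weights) {
  let matched = 0;
  let total = 0;
  for (const [name, weight] of Object.entries(weights)) {
    total += weight;
    if (name in a && name in b && a[name] === b[name]) matched += weight;
  }
  return total === 0 ? 0 : matched / total;
}

const weights = { canvas: 6, webgl: 5, timezone: 3, touchSupport: 1 };
const seenBefore = { canvas: 'c41f', webgl: '9ab2', timezone: -480, touchSupport: false };
const newVisit = { canvas: 'c41f', webgl: '9ab2', timezone: -480, touchSupport: true };

// Only the low-weight feature differs, so the score stays high: 14 / 15.
console.log(weightedSimilarity(seenBefore, newVisit, weights));
```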

The paper "(Cross-)Browser Fingerprinting via OS and Hardware Level Features" [ http://yinzhicao.org/TrackingFree/crossbrowsertracking_NDSS17.pdf ] studies the information entropy and stability of each feature in detail.

According to this paper, the time zone, the screen resolution and color depth, and the Canvas and WebGL features carry relatively high information entropy, and therefore a relatively large weight, in a cross-browser fingerprint. Let's take a look at what these advanced fingerprints contain.

Canvas fingerprint

Canvas is the dynamic drawing element in HTML5; it can also be used to generate or process images. Even when the same content is drawn, differences in operating system, font rendering engine, anti-aliasing, sub-pixel rendering and other algorithms mean that the image produced from the same text differs from one machine to another.

The implementation is roughly as follows: render some text on the canvas and then export it with toDataURL. The same value is obtained even in private browsing mode.

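function getCanvasFingerprint() {
    var canvas = document.createElement('canvas');
    var context = canvas.getContext('2d');
    context.font = '18pt Arial';
    context.textBaseline = 'top';
    context.fillText('Hello, user.', 2, 2);
    // Export as PNG (the default). PNG is lossless, so rendering differences survive;
    // JPEG would flatten the transparent background to black and hide the black text.
    return canvas.toDataURL();
}
getCanvasFingerprint();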

The process is simple: render the text, then call toDataURL, which exports the contents of the entire canvas, to obtain the value.

WebGL fingerprint

WebGL (Web Graphics Library) is a JavaScript API for rendering high-performance interactive 3D and 2D graphics in any compatible web browser without plug-ins. It does this by introducing an API that closely conforms to OpenGL ES 2.0 and can be used in HTML5 canvas elements. This conformance lets the API take advantage of the hardware graphics acceleration provided by the user's device. Websites can use WebGL to fingerprint devices. Generally, there are two ways to produce such a fingerprint:

  • WebGL report: the complete WebGL capability report of the browser can be read and inspected. In some cases it is converted to a hash value for faster analysis.

  • WebGL image: a hidden 3D image is rendered and converted to a hash value. Because the final result depends on the hardware that performs the calculation, this method produces unique values for different combinations of devices and drivers.

You can use the BrowserLeaks test site to see what information a website can obtain through this API.

The principle of generating a WebGL fingerprint is to draw a gradient object with shaders and convert the image into a Base64 string. All the WebGL extensions and capabilities are then enumerated and appended to the Base64 string, producing a very long string that is likely to be unique to each device.

For example, this is how the Fingerprintjs2 library produces a WebGL fingerprint:

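// Partial code, taken from inside a function.
// getWebglCanvas() is the library's helper; it is assumed to return a WebGL context, or null if WebGL is unavailable.
var gl = getWebglCanvas()
if (!gl) { return null }
var result = []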
var vShaderTemplate = 'attribute vec2 attrVertex;varying vec2 varyinTexCoordinate;uniform vec2 uniformOffset;void main(){varyinTexCoordinate=attrVertex+uniformOffset;gl_Position=vec4(attrVertex,0,1);}'
var fShaderTemplate = 'precision mediump float;varying vec2 varyinTexCoordinate;void main() {gl_FragColor=vec4(varyinTexCoordinate,0,1);}'
var vertexPosBuffer = gl.createBuffer()    
gl.bindBuffer(gl.ARRAY_BUFFER, vertexPosBuffer)    
var vertices = new Float32Array([-0.2, -0.9, 0, 0.4, -0.26, 0, 0, 0.732134444, 0])
// The data store of the Buffer object is created and initialized.
gl.bufferData(gl.ARRAY_BUFFER, vertices, gl.STATIC_DRAW) 
vertexPosBuffer.itemSize = 3
vertexPosBuffer.numItems = 3
// Create and initialize a WebGLProgram object.
var program = gl.createProgram()
// Create shader object
var vshader = gl.createShader(gl.VERTEX_SHADER)
// The next two lines configure shaders 
gl.shaderSource(vshader, vShaderTemplate)  // Set shader code  
gl.compileShader(vshader) // Compile a shader for use by WebGLProgram objects
    
var fshader = gl.createShader(gl.FRAGMENT_SHADER)   
gl.shaderSource(fshader, fShaderTemplate)    
gl.compileShader(fshader)    
// Attach the compiled vertex and fragment shaders to the program
gl.attachShader(program, vshader)
gl.attachShader(program, fshader)
// Link WebGLProgram object
gl.linkProgram(program)
// Make the linked WebGLProgram part of the current rendering state
gl.useProgram(program)
program.vertexPosAttrib = gl.getAttribLocation(program, 'attrVertex')
program.offsetUniform = gl.getUniformLocation(program, 'uniformOffset')
// Enable the vertex attribute that was just looked up
gl.enableVertexAttribArray(program.vertexPosAttrib)
gl.vertexAttribPointer(program.vertexPosAttrib, vertexPosBuffer.itemSize, gl.FLOAT, !1, 0, 0)
gl.uniform2f(program.offsetUniform, 1, 1)
// Draw primitives from the vertex array data
gl.drawArrays(gl.TRIANGLE_STRIP, 0, vertexPosBuffer.numItems)
try {        
    result.push(gl.canvas.toDataURL())    
} catch (e) {        
    /* .toDataURL may be absent or broken (blocked by extension) */
}
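
The excerpt above produces the image half of the fingerprint. The report half, that is, enumerating parameters and extensions, could be sketched as follows. This is a minimal illustration rather than the library's own code: the function name getWebglReport and the selection of parameters are assumptions, while getParameter and getSupportedExtensions are standard WebGL context methods.

```javascript
// Build a "WebGL report" string from a WebGL rendering context. `gl` is passed
// in by the caller, e.g. document.createElement('canvas').getContext('webgl').
function getWebglReport(gl) {
  if (!gl) { return null; }
  // A small, arbitrary selection of parameters; a real report reads many more.
  const parameters = [
    'VENDOR',
    'RENDERER',
    'VERSION',
    'SHADING_LANGUAGE_VERSION',
    'MAX_TEXTURE_SIZE',
  ];
  const lines = parameters.map((name) => name + ':' + gl.getParameter(gl[name]));
  // getSupportedExtensions() returns null if the context has been lost.
  const extensions = gl.getSupportedExtensions() || [];
  lines.push('EXTENSIONS:' + extensions.slice().sort().join(','));
  return lines.join('\n');
}
```

The resulting string can be appended to the Base64 image data and hashed, so that the image and the report together form one WebGL fingerprint.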

How to prevent "user fingerprint" from being generated

As mentioned at the beginning, many people both love and hate browser fingerprinting. A large number of websites use various techniques to generate user fingerprints so that they can give users more accurate recommendations that match their browsing habits. While users enjoy this convenience, they inevitably feel anxious and uneasy about privacy leaks. So how do we prevent user fingerprints from being generated?

Obfuscating Canvas fingerprints

We have seen how canvas fingerprints are obtained, so how can we prevent malicious collection? To obfuscate the canvas fingerprint, you only need to tamper with the result returned by toDataURL.

toDataURL() exports the contents of the whole canvas, so we need to modify some of those contents first. We can copy the pixel data of a given rectangle on the canvas with getImageData(), alter it, put the image data back with putImageData(), and then export the image with toDataURL().

CanvasRenderingContext2D.getImageData() returns an ImageData object describing the underlying pixel data for a region of the canvas. The region is a rectangle whose starting point is (sx, sy), with width sw and height sh.

The ImageData interface describes the underlying pixel data of an area of a canvas element. It can be created with the ImageData() constructor, or with the createImageData() and getImageData() methods of the CanvasRenderingContext2D object associated with a canvas.

The ImageData object stores the actual pixel data of the canvas. It has several read-only properties:

  • width: the image width, in pixels

  • height: the image height, in pixels

  • data: a one-dimensional array of type Uint8ClampedArray containing the RGBA data as integers from 0 to 255. It can be regarded as the raw pixel data. Each pixel uses four 1-byte values (in the order red, green, blue, alpha), and each component is assigned a consecutive index within the array. The red component of the first pixel, in the upper left corner, is at index 0 of the array. Pixels then proceed from left to right and from top to bottom through the entire array.

The Uint8ClampedArray contains height × width × 4 bytes of data, with index values ranging from 0 to (height × width × 4) - 1.

For example, to read the blue component of the pixel in row 50, column 200 of the image:

const blueComponent = imageData.data[50 * (imageData.width * 4) + 200 * 4 + 2];
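
The index arithmetic can be checked without a browser. The sketch below uses a plain object as a stand-in for an ImageData of 300 × 100 pixels; the helper name channelIndex and the stand-in object are made up for this example.

```javascript
// Index of one channel of the pixel at (row, column) inside ImageData.data.
// Channel offsets: 0 = red, 1 = green, 2 = blue, 3 = alpha.
function channelIndex(width, row, column, channel) {
  return row * (width * 4) + column * 4 + channel;
}

// Stand-in for a 300-pixel-wide, 100-pixel-high ImageData; in a browser this
// would come from context.getImageData(0, 0, 300, 100).
const fakeImageData = { width: 300, height: 100, data: new Uint8ClampedArray(300 * 100 * 4) };

const blueIndex = channelIndex(fakeImageData.width, 50, 200, 2);
console.log(blueIndex); // 60802

// A Uint8ClampedArray clamps out-of-range writes instead of wrapping around.
fakeImageData.data[blueIndex] = 300;
console.log(fakeImageData.data[blueIndex]); // 255
```

The clamping matters for obfuscation: adding a small random offset to a channel can never overflow, the value simply stays within 0 to 255.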

Here's how to obfuscate Canvas fingerprints:


// Keep references to the native methods so the wrappers below can still call them
const toBlob = HTMLCanvasElement.prototype.toBlob;
const toDataURL = HTMLCanvasElement.prototype.toDataURL;
// Assumption: `script` is the <script> element that injected this code, and its
// data-active attribute switches the protection on and off
const script = document.currentScript;
HTMLCanvasElement.prototype.manipulate = function() {
  const {width, height} = this;
  // Get the CanvasRenderingContext2D of the canvas before toDataURL or toBlob runs
  const context = this.getContext('2d');
  // One random offset per color channel, in the range -5 to 4
  const shift = {
    'r': Math.floor(Math.random() * 10) - 5,
    'g': Math.floor(Math.random() * 10) - 5,
    'b': Math.floor(Math.random() * 10) - 5
  };
  const matt = context.getImageData(0, 0, width, height);
  // Shift the r, g and b values of a sparse grid of sample pixels in the ImageData (pixel source data)
  // so that the exported image differs; the Uint8ClampedArray keeps each value within 0 to 255.
  for (let i = 0; i < height; i += Math.max(1, Math.floor(height / 10))) {
    for (let j = 0; j < width; j += Math.max(1, Math.floor(width / 10))) {
      const n = ((i * (width * 4)) + (j * 4));
      matt.data[n + 0] = matt.data[n + 0] + shift.r; // Plus random disturbance
      matt.data[n + 1] = matt.data[n + 1] + shift.g;
      matt.data[n + 2] = matt.data[n + 2] + shift.b;
    }
  }
  context.putImageData(matt, 0, 0); // Put it back
};
// Modify prototype toBlob
Object.defineProperty(HTMLCanvasElement.prototype, 'toBlob', {
  value: function() {
    if (script.dataset.active === 'true') {
      try {
        this.manipulate(); // Before each toBlob, first confuse the ImageData
      }
      catch(e) {
        console.warn('manipulation failed', e);
      }
    }
    return toBlob.apply(this, arguments);
  }
});
// Modify prototype toDataURL
Object.defineProperty(HTMLCanvasElement.prototype, 'toDataURL', {
  value: function() {
    if (script.dataset.active === 'true') {
      try {
        this.manipulate(); // Before each toDataURL, confuse ImageData
      }
      catch(e) {
        console.warn('manipulation failed', e);
      }
    }
    return toDataURL.apply(this, arguments);
  }
});

Obfuscating other fingerprints

The idea is the same as for canvas fingerprints: change the prototype of the object from which the value is read.

For example, to obfuscate the time zone, change the return value of Date.prototype.getTimezoneOffset.

To obfuscate the resolution, change document.documentElement.clientHeight and document.documentElement.clientWidth.

To obfuscate WebGL, change the bufferData and getParameter methods of the WebGL context, and so on.
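
As a minimal sketch of this prototype-patching idea, the example below overrides Date.prototype.getTimezoneOffset so that every caller sees UTC. The helper name overrideMethod is invented for this example, and reporting a fixed offset of 0 is just one possible policy, not what any particular extension does.

```javascript
// Replace a method on a prototype and return a function that restores it.
// `makeReplacement` receives the original method and returns the new one.
function overrideMethod(proto, name, makeReplacement) {
  const original = proto[name];
  Object.defineProperty(proto, name, {
    value: makeReplacement(original),
    configurable: true,
    writable: true,
  });
  return function restore() {
    Object.defineProperty(proto, name, { value: original, configurable: true, writable: true });
  };
}

// Report UTC (offset 0) to every caller, whatever the real time zone is.
const restoreTimezone = overrideMethod(Date.prototype, 'getTimezoneOffset', () => function () {
  return 0;
});
console.log(new Date().getTimezoneOffset()); // 0

// Undo the override again.
restoreTimezone();
```

The same helper could wrap other methods such as getParameter. Note that the time zone is also exposed through other APIs, for example Intl.DateTimeFormat().resolvedOptions().timeZone, so patching a single method is not sufficient on its own.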

Of course, there are also simpler ways to prevent user fingerprints from being generated. For example, browser extensions (Canvas Blocker, WebGL Fingerprint Defender, Fingerprint Spoofing, etc.) execute a piece of JS code before the web page loads and rewrite various JS functions, so that the website either cannot obtain the information or receives false data, thereby protecting our private information.

Keywords: Front-end Cyber Security

Added by PHPFreaksMaster on Tue, 25 Jan 2022 08:23:54 +0200