PHP FFI Details - A new way to extend PHP

PHP 7.4 ships with an extension that I think is very useful: PHP FFI (Foreign Function Interface). To quote the PHP FFI RFC:

For PHP, FFI provides a way to write PHP extensions and bindings to C libraries in pure PHP.

In general, an FFI lets code written in one high-level language call code written in another. For PHP, FFI lets us easily call libraries written in C.

In fact, a large number of PHP extensions are wrappers around existing C libraries, such as the commonly used mysqli, curl, and gettext. PECL also contains a large number of similar extensions.

Traditionally, when we want to use an existing C library, we have to write a wrapper in C and package it as an extension, which means learning how to write PHP extensions. There are some more convenient routes, such as Zephir, but there is always a learning cost. With FFI, we can call functions from libraries written in C directly in PHP scripts.

Over the decades of C's history an enormous number of excellent libraries have accumulated, and FFI lets us tap into this huge resource directly.

Back to the point: today I'll use an example to show how we can call libcurl from PHP to fetch the contents of a web page. Why libcurl? Doesn't PHP already have a curl extension? First, because I'm familiar with libcurl's API, and second, precisely because the extension exists, it makes a good comparison of how easy the traditional extension and the FFI approach are to use.

Take the article you are currently reading as an example: suppose I need to write a piece of code to fetch its contents. With PHP's traditional curl extension, we would probably write something like this:

<?php

$url = "https://www.laruence.com/2020/03/11/5475.html";
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);

curl_exec($ch);

curl_close($ch);

(Since my website uses HTTPS, there is one extra call to set SSL_VERIFYPEER.) So what does this look like with FFI?

First, enable ext/ffi for PHP 7.4. Note that PHP FFI requires libffi 3 or later.

Then we need to tell PHP FFI what the prototypes of the functions we want to call look like. For this we use FFI::cdef, whose prototype is:

FFI::cdef([string $cdef = "" [, string $lib = null]]): FFI

In the string $cdef we write C function declarations. FFI parses them and learns the signatures of the functions we are going to call in the library named by the string $lib. In this example we use four libcurl functions, whose declarations can be found in the libcurl documentation, for example the one for curl_easy_init.
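To get a feel for FFI::cdef before turning to libcurl, here is a minimal sketch that binds a single function from the C standard library. It assumes ext/ffi is enabled and a Linux system with glibc, where the library file is usually called libc.so.6; that file name is an assumption and is not taken from the example in this article.

```php
<?php
// Declare the C prototype we want to call and name the library providing it.
// "libc.so.6" is an assumption: the usual glibc file name on Linux.
$libc = FFI::cdef(
    "size_t strlen(const char *s);",
    "libc.so.6"
);

// A PHP string can be passed where the C function expects a const char *.
$len = $libc->strlen("hello FFI");

var_dump($len); // int(9)
```

The declared functions become methods on the returned FFI object, which is exactly how the libcurl functions are called below.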

For this example, let's write a curl.php that contains all the declarations. The code is as follows:

$libcurl = FFI::cdef(<<<CTYPE
void *curl_easy_init();
int curl_easy_setopt(void *curl, int option, ...);
int curl_easy_perform(void *curl);
void curl_easy_cleanup(void *handle);
CTYPE
 , "libcurl.so"
 );

One thing to note here: curl_easy_init actually returns CURL *, but since our example never dereferences it and only passes it around, we replace it with void * to save ourselves the trouble.

There is one more hassle, though. The curl extension predefines the CURLOPT_* constants for us, whereas with FFI we have to define them ourselves:

<?php
const CURLOPT_URL = 10002;
const CURLOPT_SSL_VERIFYPEER = 64;

$libcurl = FFI::cdef(<<<CTYPE
void *curl_easy_init();
int curl_easy_setopt(void *curl, int option, ...);
int curl_easy_perform(void *curl);
void curl_easy_cleanup(void *handle);
CTYPE
 , "libcurl.so"
 );

Okay, with that the definitions are complete. Now let's write the actual logic. The whole code becomes:

<?php
require "curl.php";

$url = "https://www.laruence.com/2020/03/11/5475.html";

$ch = $libcurl->curl_easy_init();
$libcurl->curl_easy_setopt($ch, CURLOPT_URL, $url);
$libcurl->curl_easy_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);

$libcurl->curl_easy_perform($ch);

$libcurl->curl_easy_cleanup($ch);

How does this compare with the version that uses the curl extension? Isn't it just as concise?

Next, let's make things a little more complicated: suppose we don't want the result to be printed directly but returned as a string. With PHP's curl extension we only need to call curl_setopt to set CURLOPT_RETURNTRANSFER to 1, but libcurl itself cannot return a string directly. Instead, it lets us supply a callback function through WRITEFUNCTION, which libcurl calls whenever data arrives. This is in fact what the PHP curl extension does as well.

At present, we can't pass a PHP function to libcurl as a callback directly through FFI, so we have two ways to do this:

1. Use WRITEDATA. By default libcurl uses fwrite as the callback function, and through WRITEDATA we can hand libcurl a file handle (a FILE *) so that it writes to that file instead of stdout.
2. Write a simple C function ourselves and pass it to libcurl through FFI.

Let's start with the first method. We need fopen for this, and this time we declare the prototypes in a C header file (file.h):

void *fopen(char *filename, char *mode);
void fclose(void *fp);

As with file.h, we put all the libcurl function declarations in curl.h:

#define FFI_LIB "libcurl.so"

void *curl_easy_init();
int curl_easy_setopt(void *curl, int option, ...);
int curl_easy_perform(void *curl);
void curl_easy_cleanup(void *handle);

Then we can use FFI::load to load the .h files:

static function load(string $filename): FFI;

But how do we tell FFI which library to load? As shown above, we tell FFI that these functions come from libcurl.so by defining the FFI_LIB macro. When we load this .h file with FFI::load, PHP FFI automatically loads libcurl.so.
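The FFI_LIB mechanism can be tried without libcurl. The following is only a sketch under a few assumptions: a Linux system with glibc (so the library is libc.so.6), a writable temporary directory, and a throwaway header whose file name is made up for this illustration.

```php
<?php
// Hypothetical header path, used only for this demonstration.
$header = sys_get_temp_dir() . "/ffi_demo_" . getmypid() . ".h";

// FFI_LIB names the library to load; "libc.so.6" is an assumption
// (the usual glibc file name on Linux).
file_put_contents($header, <<<'H'
#define FFI_LIB "libc.so.6"

int abs(int j);
H
);

// FFI::load reads the declarations and loads the library named by FFI_LIB.
$libc = FFI::load($header);
$result = $libc->abs(-42);

unlink($header);
var_dump($result); // int(42)
```

Keeping the library name next to the declarations means the PHP code only has to know the header path.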

Why doesn't fopen need a library to be specified? Because FFI also looks for symbols among those already loaded into the running process, and fopen is a standard library function that is already there.
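A small sketch of this lookup, assuming a platform such as Linux where symbols of the already loaded C library are visible to FFI when no library is given: we declare getpid without a library argument and compare its result with PHP's own getmypid().

```php
<?php
// No library argument: the symbol is looked up among the libraries
// already loaded into the PHP process (this relies on platform support).
$c = FFI::cdef("int getpid(void);");

$pidFromC = $c->getpid();

// Both calls run in the same process, so they report the same id.
var_dump($pidFromC === getmypid()); // bool(true)
```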

Okay, now the whole code will be:

<?php
const CURLOPT_URL = 10002;
const CURLOPT_SSL_VERIFYPEER = 64;
const CURLOPT_WRITEDATA = 10001;

$libc = FFI::load("file.h");
$libcurl = FFI::load("curl.h");

$url = "https://www.laruence.com/2020/03/11/5475.html";
$tmpfile = "/tmp/tmpfile.out";

$ch = $libcurl->curl_easy_init();
$fp = $libc->fopen($tmpfile, "a");

$libcurl->curl_easy_setopt($ch, CURLOPT_URL, $url);
$libcurl->curl_easy_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$libcurl->curl_easy_setopt($ch, CURLOPT_WRITEDATA, $fp);
$libcurl->curl_easy_perform($ch);

$libcurl->curl_easy_cleanup($ch);

$libc->fclose($fp);

$ret = file_get_contents($tmpfile);
@unlink($tmpfile);

But this approach needs a temporary file as an intermediary, which is not very elegant. Now let's use the second method, for which we need to write a callback function in C and pass it to libcurl:

#include <stdlib.h>
#include <string.h>
#include "write.h"

/* Called by libcurl for each chunk of data; appends it to d->buf. */
size_t own_writefunc(void *ptr, size_t size, size_t nmember, void *data) {
        own_write_data *d = (own_write_data *)data;
        size_t total = size * nmember;

        if (d->buf == NULL) {
                d->buf = malloc(total);
                if (d->buf == NULL) {
                        return 0;
                }
                d->size = total;
                memcpy(d->buf, ptr, total);
        } else {
                d->buf = realloc(d->buf, d->size + total);
                if (d->buf == NULL) {
                        return 0;
                }
                /* Cast to char * so the pointer arithmetic is standard C. */
                memcpy((char *)d->buf + d->size, ptr, total);
                d->size += total;
        }

        return total;
}

void *init() {
        return &own_writefunc;
}

Note the init function here: in the current version of PHP FFI (as of 2020-03-11) we cannot get a function pointer directly, so we define this function to return the address of own_writefunc.

Finally, we define the header file write.h used above:

#define FFI_LIB "write.so"

typedef struct _writedata {
        void *buf;
        size_t size;
} own_write_data;

void *init();

Notice that we also define FFI_LIB in this header file, so that the same header can be used both by write.c and by the PHP FFI code that follows.

Then we compile the write function as a dynamic library:

gcc -O2 -fPIC -shared -g write.c -o write.so

Okay, now the whole code will become:

<?php
const CURLOPT_URL = 10002;
const CURLOPT_SSL_VERIFYPEER = 64;
const CURLOPT_WRITEDATA = 10001;
const CURLOPT_WRITEFUNCTION = 20011;

$libcurl = FFI::load("curl.h");
$write = FFI::load("write.h");

$url = "https://www.laruence.com/2020/03/11/5475.html";

$data = $write->new("own_write_data");

$ch = $libcurl->curl_easy_init();

$libcurl->curl_easy_setopt($ch, CURLOPT_URL, $url);
$libcurl->curl_easy_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$libcurl->curl_easy_setopt($ch, CURLOPT_WRITEDATA, FFI::addr($data));
$libcurl->curl_easy_setopt($ch, CURLOPT_WRITEFUNCTION, $write->init());
$libcurl->curl_easy_perform($ch);

$libcurl->curl_easy_cleanup($ch);

$ret = FFI::string($data->buf, $data->size);

Here we use FFI::new (called as $write->new) to allocate memory for an own_write_data struct:

function FFI::new(mixed $type [, bool $own = true [, bool $persistent = false]]): FFI\CData

$own indicates whether the memory is managed by PHP. By default the memory we allocate follows the life cycle of the PHP variable and does not need to be freed explicitly, but sometimes you may want to manage it yourself. In that case you can set $own to false, and then you need to call FFI::free at the appropriate time to release it.
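The difference between the two ownership modes can be sketched as follows. The point struct is a made-up type for illustration only and has nothing to do with the libcurl example; the sketch assumes ext/ffi is enabled.

```php
<?php
// An FFI instance that only declares a (hypothetical) type; no library needed.
$ffi = FFI::cdef("typedef struct { int x; int y; } point;");

// Default ($own = true): PHP releases the memory together with $managed.
$managed = $ffi->new("point");
$managed->x = 3;

// $own = false: the memory is not tied to the lifetime of the PHP variable...
$unmanaged = $ffi->new("point", false);
$unmanaged->y = 4;

$sum = $managed->x + $unmanaged->y;

// ...so we have to release it ourselves once we are done with it.
FFI::free($unmanaged);

var_dump($sum); // int(7)
```

After FFI::free the $unmanaged object must not be used any more.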

Then we pass $data to libcurl as WRITEDATA, using FFI::addr to get the actual memory address of $data:

static function addr(FFI\CData $cdata): FFI\CData;

We then pass our own_writefunc to libcurl as WRITEFUNCTION, so that when data comes back, libcurl calls own_writefunc to process it and passes our $data struct to the callback as its custom parameter.

Finally, we use FFI::string to convert a region of memory into a PHP string:

static function FFI::string(FFI\CData $src [, int $size]): string

When $size is not provided, FFI::string stops at the first null byte.
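The effect of the $size argument is easy to see on a small buffer. This is a sketch that assumes ext/ffi is enabled; the buffer and its contents are made up for illustration.

```php
<?php
$ffi = FFI::cdef();

// An 8-byte C buffer, filled with 5 bytes that contain an embedded NUL.
$buf = $ffi->new("char[8]");
FFI::memcpy($buf, "ab\0cd", 5);

// Without $size, conversion stops at the first null byte.
$untilNul = FFI::string($buf);

// With $size, exactly that many bytes are copied, NUL included.
$exact = FFI::string($buf, 5);

var_dump($untilNul);      // string(2) "ab"
var_dump(strlen($exact)); // int(5)
```

This is why the libcurl example passes $data->size explicitly: a response body is not guaranteed to be null-terminated or free of null bytes.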

Okay, shall we run it?

There is a catch, though: loading the .so files on every request directly in PHP can be a significant performance problem. So we can also use preloading, where the libraries are loaded at PHP startup through opcache.preload:

ffi.enable = 1
opcache.preload = ffi_preload.inc

ffi_preload.inc:

<?php
FFI::load("curl.h");
FFI::load("write.h");

But how do we refer to the FFI instances that were loaded during preloading? For that we need to modify the two .h header files and add FFI_SCOPE, for example in curl.h:

#define FFI_LIB "libcurl.so"
#define FFI_SCOPE "libcurl"

void *curl_easy_init();
int curl_easy_setopt(void *curl, int option, ...);
int curl_easy_perform(void *curl);
void curl_easy_cleanup(void *handle);

Likewise, we add an FFI_SCOPE of "write" to write.h, and our script now looks like this:

<?php
const CURLOPT_URL = 10002;
const CURLOPT_SSL_VERIFYPEER = 64;
const CURLOPT_WRITEDATA = 10001;
const CURLOPT_WRITEFUNCTION = 20011;

$libcurl = FFI::scope("libcurl");
$write = FFI::scope("write");

$url = "https://www.laruence.com/2020/03/11/5475.html";

$data = $write->new("own_write_data");

$ch = $libcurl->curl_easy_init();

$libcurl->curl_easy_setopt($ch, CURLOPT_URL, $url);
$libcurl->curl_easy_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$libcurl->curl_easy_setopt($ch, CURLOPT_WRITEDATA, FFI::addr($data));
$libcurl->curl_easy_setopt($ch, CURLOPT_WRITEFUNCTION, $write->init());
$libcurl->curl_easy_perform($ch);

$libcurl->curl_easy_cleanup($ch);

$ret = FFI::string($data->buf, $data->size);

That is, we now use FFI::scope instead of FFI::load to reference the corresponding functions.

static function scope(string $name): FFI;

Then there is another problem. Although FFI gives us a lot of power, calling C library functions directly is still very risky, and we should only allow users to call functions that we have vetted. This is what ffi.enable=preload is for: when we set ffi.enable=preload, only functions in the opcache.preload script can call FFI, and code written by users cannot call it directly.

Let's change ffi_preload.inc slightly and turn it into ffi_safe_preload.inc:

<?php
class CURLOPT {
     const URL = 10002;
     const SSL_VERIFYHOST = 81;
     const SSL_VERIFYPEER = 64;
     const WRITEDATA = 10001;
     const WRITEFUNCTION = 20011;
}

FFI::load("curl.h");
FFI::load("write.h");

function get_libcurl(): FFI {
     return FFI::scope("libcurl");
}

function get_write_data($write): FFI\CData {
     return $write->new("own_write_data");
}

function get_write(): FFI {
     return FFI::scope("write");
}

function get_data_addr($data): FFI\CData {
     return FFI::addr($data);
}

function paser_libcurl_ret($data): string {
     return FFI::string($data->buf, $data->size);
}

That is, we define all the functions that call the FFI API in the preload script, and then our example becomes (ffi_safe.php):

<?php
$libcurl = get_libcurl();
$write = get_write();
$data = get_write_data($write);

$url = "https://www.laruence.com/2020/03/11/5475.html";

$ch = $libcurl->curl_easy_init();

$libcurl->curl_easy_setopt($ch, CURLOPT::URL, $url);
$libcurl->curl_easy_setopt($ch, CURLOPT::SSL_VERIFYPEER, 0);
$libcurl->curl_easy_setopt($ch, CURLOPT::WRITEDATA, get_data_addr($data));
$libcurl->curl_easy_setopt($ch, CURLOPT::WRITEFUNCTION, $write->init());
$libcurl->curl_easy_perform($ch);

$libcurl->curl_easy_cleanup($ch);

$ret = paser_libcurl_ret($data);

This way, with ffi.enable=preload, all FFI API calls are restricted to preload scripts that we control and cannot be made directly by users. We can then put the necessary safety checks inside these functions and so guarantee a certain degree of security.

Well, after working through this example you should have a better understanding of FFI. For detailed API documentation, refer to the PHP FFI manual. If you are interested, why not find a C library and give it a try?

The examples in this article can be downloaded from my GitHub: FFI example

Last but not least, the examples only demonstrate the functionality, so a lot of error checking has been left out. You should add it back when you write your own code. After all, FFI gives you 1,000 ways to crash PHP with a segfault, so be careful.
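As one illustration of the kind of check that was left out, here is a hedged sketch around fopen. It assumes ext/ffi is enabled and a platform where the C library's symbols can be resolved without naming a library; the path is hypothetical and chosen so that the call fails.

```php
<?php
// fopen and fclose are looked up among the already loaded symbols.
$libc = FFI::cdef(<<<'CDEF'
void *fopen(const char *filename, const char *mode);
int fclose(void *fp);
CDEF);

// Hypothetical path in a directory that does not exist, so fopen fails.
$path = "/nonexistent-dir/ffi-demo.txt";
$fp = $libc->fopen($path, "r");

// A failed fopen returns a C NULL pointer. This check covers both
// PHP null and a null CData pointer before the handle is used.
if ($fp === null || FFI::isNull($fp)) {
    $status = "fopen failed for $path";
} else {
    $libc->fclose($fp);
    $status = "opened $path";
}

echo $status, PHP_EOL;
```

The same pattern applies to curl_easy_init, which can also return NULL, and to the int result codes of curl_easy_setopt and curl_easy_perform.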


Keywords: Programming PHP curl C github

Added by $SuperString on Mon, 30 Mar 2020 06:00:01 +0300