Troubleshooting Xdebug remote debugging sessions

As I've mentioned previously, remote debugging is one of the greatest features offered by Xdebug. Remote debugging allows you to open your website in your browser, which initiates a step-by-step debugger in your preferred IDE.

In theory, setting up a remote debugging session should be as simple as following these steps:

  1. Install the Xdebug extension in PHP, e.g. pecl install xdebug
  2. Enable the extension in your php.ini, e.g. add the following to your php.ini: zend_extension="xdebug.so"
  3. Enable the remote debugging in your php.ini configuration, e.g. xdebug.remote_enable = 1
  4. Make your IDE listen for remote debugging connections
  5. Start the remote debugging session in your browser using an addon or by adding ?XDEBUG_SESSION_START=my_session to the URL.
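
For reference, here is a minimal sketch of what the relevant lines in php.ini might look like after steps 2 and 3. The exact extension path and values depend on your installation, so treat these as example values only:

zend_extension="xdebug.so"
xdebug.remote_enable = 1
xdebug.remote_host = localhost
xdebug.remote_port = 9000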

Sometimes things are not quite as simple. There have been plenty of times when I've had to figure out why remote debugging wasn't working in my own development environment, or had to help others figure out why it wasn't working for them.

In this article, I hope to go through some of the common issues that I've encountered when setting up Xdebug and how you can detect and solve these issues.

Making sure Xdebug is enabled

The first thing you absolutely want to make sure of is that the Xdebug extension is actually enabled. You also want to verify this on your web page itself. Even if you see a line in the php -v output indicating that Xdebug is enabled, it doesn't mean that your server has loaded the extension.

The easiest way to check this is to create a page that simply contains the following piece of code:

<?php phpinfo();

Usually, it's easiest to replace whatever you have as your application entry point (e.g. index.php) with this.

This piece of code will display your server's PHP configuration and all the relevant information. To find out if you have the Xdebug extension enabled, just search the page for xdebug. If you can't find the appropriate section on your info page, it means the extension has not been loaded.
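
If you'd rather not scan the whole phpinfo() page, a more targeted sketch like the following should also tell you whether the extension is loaded (and which version), using the standard extension_loaded() and phpversion() functions:

<?php
var_dump(extension_loaded('xdebug')); // true if the extension is loaded
var_dump(phpversion('xdebug'));       // e.g. "2.6.1", or false when not loaded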

There are a number of reasons why Xdebug might not have been loaded. Here are some common issues that you should check:

  • Ensure that the extension is loaded via the php.ini file that the server actually uses. Use the "Loaded Configuration File" value in your phpinfo() output to verify that you added the zend_extension directive to the correct ini file (or to any other ini file indicated by phpinfo()).
  • Make sure you actually restarted your server. I would recommend stopping your server and using something like sudo lsof -nPi|grep httpd to ensure that no server process is actually running. There have been times when I've actually had to manually kill server processes in order to restart it because none of the command line commands worked.
  • Check your server error logs to see if there have been problems trying to load the extension. It's possible, for example, that your extension_dir setting is incorrect, which makes PHP unable to find the extension.
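
As a quick sanity check for the ini and error log points, something along these lines can help; note that the error log path is just an example and depends entirely on your server setup:

$ php --ini                             # lists the ini files loaded by the CLI (the server may load different ones)
$ tail -n 50 /var/log/apache2/error.log # example path; check your server configuration for the real one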

Check your Xdebug configuration

If you've verified in phpinfo() that you actually have the Xdebug extension loaded, but remote debugging is still not working, it's time to check your configuration next. Remember that you should always use phpinfo() to ensure that any configuration change you make is reflected on the server.

The three most important configuration values are xdebug.remote_enable, xdebug.remote_host and xdebug.remote_port.

The value of xdebug.remote_enable should be "1". Otherwise, the remote debugging simply isn't enabled.

The value of xdebug.remote_host should be wherever you want Xdebug to connect to. If you're just using your local machine for development, it's usually just "localhost". However, if you're using a virtual machine, you may need to make sure that this points to whatever address the virtual machine uses to connect to your own computer. This may vary depending on the type of virtual machine and the operating system you are using.

The value of xdebug.remote_port should be the same port that your IDE is configured to listen on. Make sure, however, that Xdebug is not configured to use a port that some other software is already listening on. For example, the default port 9000 used by Xdebug is also the default port of php-fpm. Once you enable listening in your IDE, you should use the command line to verify that your IDE is actually listening on that port.

For example, you should see something like the following:

$ sudo lsof -nPi|grep 9000
phpstorm  2755          riimu  380u  IPv6 0xd613775cd0f8575f      0t0    TCP *:9000 (LISTEN)

If you are using Xdebug on a remote host rather than the local machine or a local virtual machine, you may be able to enable the xdebug.remote_connect_back configuration, which makes Xdebug connect back to the machine that made the request instead of using the xdebug.remote_host configuration.
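
To illustrate, here is a rough sketch of what these settings might look like in a virtual machine setup. The address 10.0.2.2 is just an example (it happens to be the host address in a default VirtualBox NAT network), so substitute whatever address your guest uses to reach your own machine:

xdebug.remote_host = 10.0.2.2
xdebug.remote_port = 9000

; On a remote server you could instead let Xdebug connect back to the requesting machine:
; xdebug.remote_connect_back = 1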

Enable remote debug logging

If you have verified that Xdebug is both enabled and, hopefully, configured properly, but still nothing happens, you should try enabling logging for the remote connections.

Xdebug offers a configuration option xdebug.remote_log to log all remote debugging sessions to a file. If you want to enable logging, make sure this option points to a writable file. If you're using something like /var/log/xdebug.log, make sure you also chmod the file appropriately so that your server can actually write into the file, e.g.

$ sudo touch /var/log/xdebug.log
$ sudo chmod a+rw /var/log/xdebug.log
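
After that, enabling the logging itself is just a single ini directive pointing at that file (as always, verify the change via phpinfo()):

xdebug.remote_log = /var/log/xdebug.log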

If you can see Xdebug writing into this log file when you retry remote debugging but nothing else happens, you should be able to get hints about why it isn't working. For example, if Xdebug cannot connect to the listening IDE, you might see something like this:

Log opened at 2019-02-20 12:46:44
I: Connecting to configured address/port: localhost:9000.
W: Creating socket for 'localhost:9000', poll success, but error: Operation now in progress (19).
W: Creating socket for 'localhost:9000', poll success, but error: Operation now in progress (19).
E: Could not connect to client. :-(
Log closed at 2019-02-20 12:46:44

If Xdebug cannot connect, again check the following:

  • Make sure that xdebug.remote_host points to your machine. Take particular care if you're using virtual machines to use the correct address or hostname for your local machine rather than the virtual machine.
  • Make sure that your IDE is configured properly and listening to the port configured in xdebug.remote_port. Take particular care to ensure that no other application is listening to the same port.
  • Don't forget to also ensure that your server can actually connect to your local machine, e.g. make sure the port you are using isn't closed or blocked by firewalls or routers.

Make sure the remote debugging session is started

If you enabled the remote debugging logging and made sure the log file is writable (and ensured these configurations are correct via phpinfo()), but you're not seeing anything in the log file, the remote session itself might not be getting started.

The first thing you can try is to enable the xdebug.remote_autostart configuration. This will make Xdebug always start a remote debugging session regardless of what the browser is requesting. This can be an easy way to ensure that the connection itself works and that the problem is simply that the session isn't getting started. Most of the time, however, I would not recommend leaving this option on.

If you can get the remote debugging working by enabling that option, but not without it, the problem may be with your browser. Remote debugging sessions can be started by either setting the URL parameter XDEBUG_SESSION_START or the cookie XDEBUG_SESSION. I highly recommend using an add-on for this purpose. Sometimes, however, the add-ons may be a bit wonky so it may be a good idea to use the network console in your browser to ensure that the appropriate GET/POST/COOKIE variables have been sent in the request.

If you're trying to trigger the remote sessions via a GET/POST variable, things may not behave as you expect with AJAX requests. The same can also apply to COOKIE values set by add-ons.
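
To take the browser and its add-ons out of the equation entirely, you can also trigger the session from the command line. The URL below is just a placeholder for whatever page you are debugging:

$ curl "http://localhost/index.php?XDEBUG_SESSION_START=my_session"
$ curl --cookie "XDEBUG_SESSION=my_session" "http://localhost/index.php"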

Configuring break points appropriately

One problem that I've occasionally encountered (particularly with PHPStorm) is that everything else might work correctly, but it doesn't seem like you are reaching the appropriate breakpoints in your code.

There are two distinct common issues that you may be facing.

The first is that your IDE does not know how the remote file paths map to your local development environment. This typically happens if you're using virtual machines or remote servers. To fix this, you need to set up appropriate path mappings in your IDE.

In PHPStorm, you can find the setting hidden in "Preferences -> PHP -> Servers -> Use path mappings". Usually, you just need to make sure the project root maps to the project root on the server.

The second issue is that breakpoints may be placed on lines that cannot break. For example, you cannot insert breakpoints on empty lines, because PHP does not generate breakable opcodes for those lines. If your code does not seem to be able to reach your breakpoints, it may be a good idea to set your IDE to break on the first line, e.g. "Run -> Break at first line in PHP Scripts", and proceed from there to see how your code behaves.
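
Another option for ruling out breakpoint placement issues is to trigger a break directly from the code using the xdebug_break() function. As a sketch, temporarily dropping something like this into the code path you are debugging should stop the debugger at that exact spot if the session is otherwise working:

<?php
// Temporary debugging aid: remove once regular breakpoints work
xdebug_break();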

Debugging multiple connections

Last, but not least, you may occasionally run into scenarios where, for example, a PHP script calls another PHP script via a URL or by invoking it externally. By default, IDEs like PHPStorm may be configured to accept only one connection, and the debugging session will halt because Xdebug will try to open another debugging connection.

In PHPStorm, this setting is governed by "Preferences -> PHP -> Debug -> Max. simultaneous connections". You will need to set this to more than 1 if you expect your script to make external calls to other PHP scripts while it is running. This may be particularly typical if you have some kind of micro-services or scheduled tasks.

Be thorough, be certain

I've now gone through some of the most common pitfalls when it comes to configuring Xdebug and setting up remote debugging. It's a powerful feature, but sometimes it may take a bit of work to get it running. It's definitely worth it, however. Once you start using a step-by-step debugger, going back to the basic var_dump() methodology seems very unpleasant.

If there is one thing you should remember from this guide, it is this: always make sure your changes are applied by checking phpinfo(), use command line tools like lsof to make sure things are behaving as you expect, and rely on log files for detailed information.

While this guide mostly applies to Unix-like operating systems, phpinfo() and log files are incredibly valuable tools on Windows as well. I know they've helped me solve a lot of strange issues.

Edit 2019-02-25: Added a reminder to check that the port is not closed, as was pointed out.

Xdebug will skew your performance

Xdebug is an absolutely invaluable tool when it comes to PHP programming. For portability reasons, I don't usually like to rely too much on extensions that are not bundled with PHP, but Xdebug is the one extension I will always include in my development environment.

Some of the smaller convenience features provided by Xdebug, like changes to var_dump() output, can be annoying at times (though, all of them can be configured). However, the extension provides three crucial features for developing an application larger than a few thousand lines:

  • Remote debugger
  • Code Coverage
  • Profiler

These great features come at a rather high cost, though. Xdebug imposes significant overhead, which not only reduces performance but does so in an unpredictable manner.

These are not the optimizations you are looking for

Whenever there's an argument about which piece of code is the fastest way to solve a problem, people usually like to rely on naive performance tests. Unfortunately, one of the questions I always have to ask first is: did you run the test with or without Xdebug? If the performance degradation caused by Xdebug were even, there would be fewer problems, but a piece of code can perform wildly differently in a production environment compared to a development environment.

Let us take a look at a simple example. Imagine you have a multi-level array of data that you want to output as JSON. However, you know the array contains DateTime instances and you want to convert those to Unix timestamps instead. Off the top of my head, I could come up with two different solutions.

The first solution would be to make a simple recursive function like so:

function recurseArray(array $array): array
{
    foreach ($array as $key => $value) {
        if (\is_array($value)) {
            $array[$key] = recurseArray($value);
        } elseif ($value instanceof DateTimeInterface) {
            $array[$key] = $value->getTimestamp();
        }
    }

    return $array;
}

In this case, PHP also provides a convenient function, array_walk_recursive(), which could do the same job:

function walkArray(array $array): array
{
    array_walk_recursive($array, function (& $value) {
        if ($value instanceof DateTimeInterface) {
            $value = $value->getTimestamp();
        }
    });

    return $array;
}

However, I can't quickly tell which of these functions is faster. On one hand, the walkArray() function has a lot of overhead due to numerous function calls, but on the other, it delegates the array recursion to PHP internals.

To test this, one would typically do a quick and naive speed test like the following:

$times = 10;
$testData = array_fill(0, 10000, [[['foo']], 2, new DateTime(), [[[[new DateTime(), [1]]]]]]);

$timer = microtime(true);

for ($i = 0; $i < $times; $i++) {
    $result = walkArray($testData);
}

echo 'Walk:    ' . round((microtime(true) - $timer) * 1000) . "ms\n";

$timer = microtime(true);

for ($i = 0; $i < $times; $i++) {
    $result = recurseArray($testData);
}

echo 'Recurse: ' . round((microtime(true) - $timer) * 1000) . "ms\n";

I ran the code and got the following results:

Walk:    1284ms
Recurse: 1285ms

Looks like they're about the same based on this speed test. But wait, did I run this code with Xdebug enabled or not? A quick command line call reveals that I did, in fact, have Xdebug enabled:

$ php -v
PHP 7.2.14 (cli) (built: Jan 12 2019 05:21:04) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.2.14, Copyright (c) 1999-2018, by Zend Technologies
    with Xdebug v2.6.1, Copyright (c) 2002-2018, by Derick Rethans

So, let's disable the extension and try running the speed test again. This time, we get the following results:

Walk:    550ms
Recurse: 284ms

Now, it looks like the recursion method is actually quite a bit faster.

While this is not the best example, it does illustrate that the performance profile of PHP is completely different based on whether you have Xdebug enabled or not.

You should not, however, stop at a simple speed test like the one above. It can give you a good idea of what might be faster, but running the code on production data may yield different results. The previous example is based on a real-world scenario in which array_walk_recursive() actually ended up being more efficient in production.

Dealing with performance woes

Despite the fact that Xdebug causes performance problems, it still provides great features, as I said in the beginning. There are several ways you could potentially deal with the problems depending on your use case and setup.

Simply don't enable Xdebug

The simplest solution is to just not enable the extension. Note that it is not enough to set xdebug.default_enable = 0 in your config. In fact, all that setting does is disable the stack traces shown on errors. The performance issues are caused simply by loading the extension, since it needs to hook into various places in the PHP core.

When you actually need to use one of the features provided by Xdebug, load the extension only on those occasions. To help with this, there are a couple of handy solutions.
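
The most basic trick, for command line scripts at least, is to load the extension only for a single run using the -d flag. This is a sketch that assumes xdebug.so lives in your extension directory and is not already enabled in php.ini:

$ php -d zend_extension=xdebug.so -d xdebug.remote_enable=1 some_script.php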

Using an xdebug toggler script

If you're using PHP on macOS installed via Homebrew, one easy way to toggle Xdebug on and off is to use the xdebug-toggle script.

If you have set up Xdebug via PECL, this script effectively just renames ext-xdebug.ini in your PHP installation's conf.d directory to enable or disable the Xdebug extension via the command line. It can also handily restart your Apache at the same time.

Setting up the extension in PHPStorm

If you're using PHPStorm, you can also add a separate path to the Xdebug extension in your PHP CLI interpreter configuration. This allows PHPStorm to include the extension via the command line only when you run scripts via "Debug" or run PHPUnit with code coverage.

Running separate containers with Xdebug

If you happen to have a dockerized PHP development setup, you could alternatively set up a second container with Xdebug enabled. This allows you to use the step-by-step remote debugging capabilities without needing to restart your server. Nginx can be set up to simply redirect the request to the appropriate container based on whether you want to enable remote debugging or not.

You can read a good write-up about that kind of setup by Juan Treminio in his article Developing at Full Speed with Xdebug.

Avoid opening remote debugging sessions on each request

If any of the aforementioned solutions seem like too much work to set up, I would at least recommend using some kind of tool to manage remote debugging sessions from the browser.

If you enable the xdebug.remote_autostart configuration, Xdebug will start a remote debugging session on every request, which will needlessly hurt the speed of your application. Managing the debugging session cookie manually isn't very handy either.

Using a browser addon like xdebug-helper can be a great boon in enabling and disabling debugging sessions. That particular extension is available for both Chrome and Firefox.

Whitelisting for code coverage

When you want to generate code coverage, there aren't many alternatives to Xdebug. While PHPUnit also supports code coverage by running your tests using the phpdbg -qrr command, it tends to come with its own set of problems (like instability, bugs, and differences in coverage metrics).

Recently, however, a new feature has been added to Xdebug which allows setting up whitelists for code coverage gathering with xdebug_set_filter(). This makes it much faster to run tests with code coverage, as the metrics are only gathered for specific files.
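
As a sketch of how that might look in practice, you could call xdebug_set_filter() in your PHPUnit bootstrap file before the tests run (the function and constants exist as of Xdebug 2.6). The bootstrap file name and the src/ directory are just an assumed project layout, so point the whitelist at wherever your actual code lives:

<?php
// e.g. in tests/bootstrap.php
if (\function_exists('xdebug_set_filter')) {
    \xdebug_set_filter(
        \XDEBUG_FILTER_CODE_COVERAGE,
        \XDEBUG_PATH_WHITELIST,
        [__DIR__ . '/../src/']
    );
}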

If you're interested, you can read more about this solution from Faster Code Coverage by Sebastian Bergmann and Sebastian Heuer.

Profile with care

Given the performance impact, you might be quick to think that this makes the profiler provided by Xdebug a bit useless. However, it can still provide quite a bit of valuable information if you use it knowing its weaknesses.
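
One way to limit the damage is to avoid profiling every request. Rather than turning the profiler on globally, a configuration sketch like the following only profiles requests that explicitly ask for it via the XDEBUG_PROFILE GET/POST parameter or cookie (the output directory is just an example):

xdebug.profiler_enable = 0
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = /tmp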

The biggest mistake you can make with Xdebug is to rely only on its metrics for speed improvements. If you try to make your code faster only under the profiler, you tend to lean towards micro-optimizations which may not actually give any real-world benefit.

However, the profiler can still provide you with useful insight about which parts are actually taking more time than you expected.

In real-world scenarios, I've often found that the profiler tends to highlight two different kinds of issues:

  • Code that was taking much more time than expected due to being called too often
  • Code that was doing something completely unintended and unnecessary.

Especially if you are unfamiliar with a code base, a profiler can help you locate potential bottlenecks. Things like a function being called 100,000 times or a script making hundreds of database queries are quite easy to spot using a profiler.

As I discussed in my previous blog post, many performance issues tend to be structural in nature and a profiler is a good tool to find those kinds of issues.

Knowing your tools is important

Xdebug is an invaluable tool for any proficient PHP developer. It can still easily lead you astray, however, if you don't fully understand it.

The way I see it:

  • The first step is to learn that Xdebug is a great tool
  • The second step is to understand that it has flaws
  • The third step is understanding how to use it effectively while fully knowing its flaws

Debug and test your performance responsibly.

The Inefficient Architecture

Have you ever heard the adage "premature optimization is the root of all evil"? It's quite possible you have, and you've probably also been taught that you should only optimize actual bottlenecks and leave optimization until the end.

In "Structured Programming with go to Statements", the paper that originated the phrase, Donald Knuth mentions how programmers waste enormous amounts of time worrying about the speed of noncritical parts of their programs. He also adds that while we should worry about the performance of the critical parts, our intuitive guesses about which parts those are tend to be incorrect. Thus, we should rely on actual tools to make these judgments.

Sometimes this leads to the extreme position that there is no point in considering the performance impact of anything until you have actual data on the real-world performance. This may, however, lead to foregoing all optimization until it is too late.

Time to ship is a woeful optimization goal

Not too long ago, I watched an insightful talk by Konstantin Kudryashov on Min-maxing Software Costs. To me, one of the most important takeaways from that talk was the fact that we tend to optimize the cost of introduction. That is, how fast it is to write new code and ship applications, because that is the easiest thing to optimize and the easiest to measure.

Because of that, we tend to build frameworks and layers that provide convenience by hiding actual implementations and making many operations implicit or lazy. The problem is that we often end up setting traps for ourselves, because we either don't fully understand the underlying mechanics or we simply forget about them because the frameworks hide everything so seamlessly.

The N+1 problem is the first sign of things to come

A typical problem caused by the convenience provided by numerous different frameworks is the N+1 query problem. If you are unfamiliar with this particular issue, let me give an example:

Let's imagine we have users and each belongs to one organization. We want to print each user and their organization. With a typical ORM/DBAL implementation, we could have code that looks something like this:

$users = $userRepository->findAllUsers();

foreach ($users as $user) {
    echo $user->getName() . ', ' . $user->getOrganization()->getName();
}

The problem with the previous example is that while findAllUsers() fetches all the users from the database using a single query, it doesn't fetch the organizations related to each user. The method call getOrganization() is actually a lazy loading function that queries the database for the organization related to the user. What ends up happening in practice is that we make one additional query for each individual user. So, the total number of database queries we end up making is 1 + N (where N is the number of users).

The appropriate solution would be to eagerly load the organizations for all the users. This could be done either by using a simple JOIN query or by doing a second query with the ids of the organizations. The optimal solution depends on the number of users per organization and your preferred framework. In this particular case, you may also just want to fetch the name fields specifically without fetching entire entities.
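
As a rough illustration of the JOIN approach using plain PDO (the repository above is hypothetical, so the table and column names here are assumptions as well):

<?php
// $pdo is assumed to be an already connected PDO instance
$statement = $pdo->query(
    'SELECT users.name AS user_name, organizations.name AS organization_name
     FROM users
     JOIN organizations ON organizations.id = users.organization_id'
);

foreach ($statement as $row) {
    echo $row['user_name'] . ', ' . $row['organization_name'] . "\n";
}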

Convenience is the enabler of bad performance

In my honest opinion, the above example shouldn't even be possible. The code should throw an exception because the organizations have not been initialized for the users in the first place. However, because pressure from schedules encourages us to optimize how quickly we can write code, frameworks provide functionality that automatically handles these relationships for us.

You might be inclined to think you would easily catch issues like the above, but real-world examples don't tend to be quite as straightforward. You might have code in another place that fetches database entities, and then somewhere else 20 layers deep, you have another piece of code that uses relations in an unexpected way which triggers additional queries.

The N+1 problem is merely the simplest demonstration of how convenience features create performance problems in applications. The core issue lies in the fact that passing data around the different layers of an application is quite difficult. We like to hide it behind abstraction layers to make it simpler to reason about, but at the same time, we stop thinking about what is actually happening behind the scenes.

I've worked on optimizing several legacy systems that had created massive bottlenecks due to how some application data storage was accessed. In one application, for example, there was a function equivalent to Storage::readValue($storageName, $key), which read a single value from the store. Each time it was called, it needed to open and close the external store, but rather than caching the values, each piece of code simply called the static function separately, as that was more convenient than figuring out how the data should be passed around different parts of the application.

Be Explicit, Be Performant

Unfortunately, there is no particular silver bullet here. Software architecture is really, really hard to get right. My own ideology, in general, is to try to be as explicit as possible in code. When designing frameworks or APIs for libraries, we should do our best not to allow the user to shoot themselves in the foot.

In particular, if a user is doing something that is potentially costly, like accessing IO, we should force the user to be as explicit about it as possible. It should not be possible to just accidentally query a database when accessing an entity relationship. Force the user to think, even for a second, about what they're about to do. This doesn't necessarily mean that APIs should be hard to use. Rather, they should be predictable.

For example, in the previously demonstrated code, the $user->getOrganization() call should not be possible if the organization has not been preloaded for the user. Instead, we should force the user to call something akin to $userRepository->loadOrganization($user).
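
As a sketch of what that could look like, the getter itself could simply refuse to lazy load. The class, property, and method names below are hypothetical and only meant to illustrate the idea:

<?php
class User
{
    /** @var Organization|null */
    private $organization;

    // Called by the repository once the relation has been explicitly loaded
    public function attachOrganization(Organization $organization): void
    {
        $this->organization = $organization;
    }

    public function getOrganization(): Organization
    {
        if ($this->organization === null) {
            // Fail loudly instead of silently querying the database
            throw new \LogicException('The organization has not been loaded for this user');
        }

        return $this->organization;
    }
}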

When you have lazy accessors to database relations, the code becomes surprisingly unpredictable. Simple getters turn into database queries which makes it frustratingly easy to forget, especially for newer developers, that these getters can have a massive performance impact.

Revisiting the topic of premature optimization, these kinds of performance problems are usually created in the initial stages of projects, because little thought is put into how performant different kinds of application structures end up being. Once you have a web application that, for example, queries the database for the same piece of data a couple hundred times in a single request, it can be really hard to fix those kinds of structural issues.

I do generally agree that optimization decisions, especially the ones about software architecture, should be based on informed opinions. There is a real danger in getting lost in meaningless micro-optimizations and focusing on the wrong things. However, if your application is inefficient by design, scaling up may become an insurmountable challenge.