Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Segfault when profiling is enabled in php 8.4.1 on Ubuntu #3012

Closed
SanderVerkuil opened this issue Dec 21, 2024 · 18 comments · Fixed by #3019
Closed

[Bug]: Segfault when profiling is enabled in php 8.4.1 on Ubuntu #3012

SanderVerkuil opened this issue Dec 21, 2024 · 18 comments · Fixed by #3019
Labels
🐛 bug Something isn't working

Comments

@SanderVerkuil
Copy link

SanderVerkuil commented Dec 21, 2024

Bug report

When running Symfony 6.4 in PHP8.4 with the profiler enabled, the PHP-FPM process exits with a segfault.

The output of dmesg | grep segfault | tail -10 is:

[354225.435912] dd-ipc-helper[3600702]: segfault at 407cd6 ip 0000000000407cd6 sp 00007a2ba0fff8c8 error 14 in memfd:spawn_worker_trampoline (deleted)[601000+1000] likely on CPU 5 (core 1, socket 0)
[354244.458155] dd-ipc-helper[2489378]: segfault at 407bd6 ip 0000000000407bd6 sp 00007a25d65ffc18 error 14 in memfd:spawn_worker_trampoline (deleted)[601000+1000] likely on CPU 5 (core 1, socket 0)

The application works properly with PHP 8.3, but is just fails with PHP 8.4. When I disable the datadog-profiling.so package, FPM works properly, but the profiling is disabled which we use quite extensively.

If more information is necessary, please let me know and I'll be happy to provide.

PHP version

8.4.1

Tracer or profiler version

1.5.1

Installed extensions

[PHP Modules]
amqp
apcu
bcmath
calendar
Core
ctype
curl
datadog-profiling
date
ddappsec
ddtrace
dom
excimer
exif
FFI
fileinfo
filter
ftp
gd
gettext
hash
iconv
igbinary
intl
json
libxml
mbstring
mysqli
mysqlnd
openssl
pcntl
pcre
PDO
pdo_mysql
Phar
posix
random
readline
redis
Reflection
session
shmop
SimpleXML
sockets
sodium
SPL
standard
sysvmsg
sysvsem
sysvshm
tokenizer
xml
xmlreader
xmlrpc
xmlwriter
xsl
Zend OPcache
zip
zlib

[Zend Modules]
Zend OPcache
datadog-profiling
ddappsec
ddtrace

Output of phpinfo()

{ "date": "2024-12-21T15:56:49Z", "os_name": "Linux legionelladossier-migration-testing 6.8.0-51-generic #52-Ubuntu SMP PREEMPT_DYNAMIC Thu Dec 5 13:09:44 UTC 2024 x86_64", "os_version": "6.8.0-51-generic", "version": "1.5.1", "lang": "php", "lang_version": "8.4.0RC4", "env": "staging", "enabled": true, "service": "LegionellaDossier", "enabled_cli": true, "agent_url": "unix:\/\/\/var\/run\/datadog\/apm.socket", "debug": false, "analytics_enabled": false, "sample_rate": 1, "sampling_rules": [], "tags": [], "service_mapping": [], "distributed_tracing_enabled": true, "dd_version": "1", "architecture": "x86_64", "instrumentation_telemetry_enabled": true, "sapi": "fpm-fcgi", "datadog.trace.sources_path": "\/opt\/datadog\/dd-library\/1.1.0\/dd-trace-sources\/src", "open_basedir_configured": false, "uri_fragment_regex": null, "uri_mapping_incoming": null, "uri_mapping_outgoing": null, "auto_flush_enabled": false, "generate_root_span": true, "http_client_split_by_domain": true, "measure_compile_time": true, "report_hostname_on_root_span": false, "traced_internal_functions": null, "enabled_from_env": true, "opcache.file_cache": null, "sidecar_trace_sender": true }

Image

Upgrading from

I'm upgrading from PHP8.3, and there it worked properly. I used to have PHP8.4.0RC4 installed, but now php 8.4.1 is installed. I did redownload the installer, and ran the installer once more.

@SanderVerkuil SanderVerkuil added the 🐛 bug Something isn't working label Dec 21, 2024
@realFlowControl
Copy link
Member

realFlowControl commented Dec 22, 2024

Hello @SanderVerkuil 👋
Sorry for the issues you are experiencing and thanks for raising this here. Do you by chance have a core file that we could retrieve a stacktrace from?
In case no, could you try run with gdb to help us find the root cause?
This might be helpful information https://docs.datadoghq.com/tracing/trace_collection/automatic_instrumentation/dd_libraries/php/#troubleshooting-an-application-crash

@SanderVerkuil
Copy link
Author

@realFlowControl Thanks for getting back to me.

Running it with valgrind showed me the following:

==2535487== Invalid read of size 8
==2535487==    at 0xD4928C7: read<[u8; 8]> (mod.rs:1215)
==2535487==    by 0xD4928C7: read<[u8; 8]> (mut_ptr.rs:1336)
==2535487==    by 0xD4928C7: len (thin_str.rs:116)
==2535487==    by 0xD4928C7: deref (thin_str.rs:63)
==2535487==    by 0xD4928C7: get_thin_str (string_set.rs:114)
==2535487==    by 0xD4928C7: get_or_insert<datadog_php_profiling::profiling::stack_walking::detail::handle_function_cache_slot::{closure_env#0}> (stack_walking.rs:118)
==2535487==    by 0xD4928C7: handle_function_cache_slot (stack_walking.rs:316)
==2535487==    by 0xD4928C7: collect_call_frame (stack_walking.rs:274)
==2535487==    by 0xD4928C7: {closure#0} (stack_walking.rs:233)
==2535487==    by 0xD4928C7: try_with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:270)
==2535487==    by 0xD4928C7: with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:246)
==2535487==    by 0xD4928C7: datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample (stack_walking.rs:202)
==2535487==    by 0xD49D8EE: collect_allocations (mod.rs:809)
==2535487==    by 0xD49D8EE: datadog_php_profiling::allocation::AllocationProfilingStats::track_allocation (allocation.rs:101)
==2535487==    by 0xD49E28D: {closure#0} (allocation.rs:397)
==2535487==    by 0xD49E28D: try_with<core::cell::RefCell<datadog_php_profiling::allocation::AllocationProfilingStats>, datadog_php_profiling::allocation::alloc_prof_malloc::{closure_env#0}, ()> (local.rs:270)
==2535487==    by 0xD49E28D: with<core::cell::RefCell<datadog_php_profiling::allocation::AllocationProfilingStats>, datadog_php_profiling::allocation::alloc_prof_malloc::{closure_env#0}, ()> (local.rs:246)
==2535487==    by 0xD49E28D: datadog_php_profiling::allocation::alloc_prof_malloc (allocation.rs:395)
==2535487==    by 0x41F707: zend_arena_create (zend_arena.h:36)
==2535487==    by 0x41F707: zend_optimize_script (zend_optimizer.c:1618)
==2535487==    by 0x833A36D: ??? (in /usr/lib/php/20240924/opcache.so)
==2535487==    by 0xC552719: ??? (resolver.c:35)
==2535487==    by 0x4F25D0: compile_filename (zend_language_scanner.c:706)
==2535487==    by 0x481D0E: zend_include_or_eval (zend_execute.c:5135)
==2535487==    by 0x49057F: ZEND_INCLUDE_OR_EVAL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:5319)
==2535487==    by 0x4BD6E4: execute_ex (zend_vm_execute.h:59318)
==2535487==    by 0x46A475: zend_call_function (zend_execute_API.c:996)
==2535487==    by 0x46A84C: zend_call_known_function (zend_execute_API.c:1090)
==2535487==  Address 0x3 is not stack'd, malloc'd or (recently) free'd
==2535487==
==2535487==
==2535487== Process terminating with default action of signal 6 (SIGABRT)
==2535487==    at 0x55D4B1C: __pthread_kill_implementation (pthread_kill.c:44)
==2535487==    by 0x55D4B1C: __pthread_kill_internal (pthread_kill.c:78)
==2535487==    by 0x55D4B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==2535487==    by 0x557B26D: raise (raise.c:26)
==2535487==    by 0x555E8FE: abort (abort.c:79)
==2535487==    by 0xCA907E6: ??? (lib.rs:43)
==2535487==    by 0xCA907D5: ??? (lib.rs:38)
==2535487==    by 0xCC72A05: ??? (panicking.rs:831)
==2535487==    by 0xCC796B2: ??? (panicking.rs:801)
==2535487==    by 0xCC7933F: ??? (panicking.rs:657)
==2535487==    by 0xCC79295: ??? (backtrace.rs:171)
==2535487==    by 0xCC7928E: ??? (panicking.rs:645)
==2535487==    by 0xC4F88F3: ??? (panicking.rs:72)
==2535487==    by 0xC4F8E61: ??? (result.rs:1649)
==2535487==
thread '<unnamed>' panicked at libdatadog/crashtracker/src/collector/emitters.rs:56:37:
called `Result::unwrap()` on an `Err` value: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Not sure what causes this, but when I disable the profile.so I see no errors.

@morrisonlevi
Copy link
Collaborator

morrisonlevi commented Dec 23, 2024

Thank you for the report. I noticed that excimer is also loaded. If you load profiling and not excimer, do you have issues? Sometimes these intrusive tools can mess with each other--I'm not aware of any current issues but it's happened some times in the past that two profilers/tracers/debuggers/observer things conflict.

@SanderVerkuil
Copy link
Author

Unfortunately disabling excimer didn't work. So at least it's not a conflict between those tools.

I did see it happening in our Docker container as well, so I might be able to reproduce it locally as well. I could also see whether I can create a demo repository where this issue can be reproduced, though with the holidays coming up I might not have time for that unfortunately.

@morrisonlevi
Copy link
Collaborator

morrisonlevi commented Jan 6, 2025

@SanderVerkuil Based on the stack trace, I think if you disable allocation profiling that it may not crash anymore. There is a config mode you use to set the INI:

php datadog-setup.php config set \
    -d datadog.profiling.allocation_enabled=0 \
    --php-bin all # or whatever binaries you want

I am continuing to look at the data you provided to try and evaluate the root cause. If you continue to have crashes after disabling memory allocation profiling, I'd like the stack traces! If you can create it with a demo repo, that would be even more helpful, of course!

@SanderVerkuil
Copy link
Author

@morrisonlevi Thanks for the response, but unfortunately the allocation_enabled setting was already set to 0 (if that's the default). Setting it to 0 also didn't solve it.

I did see something related to Broken pipe in the traces, might be something similar to #2991, which causes a segfault when running it in php 8.4?

@morrisonlevi
Copy link
Collaborator

morrisonlevi commented Jan 6, 2025

The default is not 0. If setting it to zero doesn't solve the crash, I'd very much like stack traces from each thread! The high level description of the stack trace you posted above is:

  1. PHP compiles a file
  2. Zend Optimizer optimizes the script, which creates an arena
  3. Allocation sampling occurs
  4. While walking the stack for allocation sampling, we find a non-zero cache line item (0x3) for the function name and try to dereference it, which crashes.

So if it fails without allocation profiling, the stack will be different and hopefully illuminating.

@SanderVerkuil
Copy link
Author

SanderVerkuil commented Jan 6, 2025

Ah alright, I've disabled the profiling, and running it with Valgrind (which was the easiest to get a stacktrace, hopefully it is sufficient), I get the following stacktrace:

[06-Jan-2025 21:09:35] NOTICE: fpm is running, pid 3015728
[06-Jan-2025 21:09:37] NOTICE: ready to handle connections
[06-Jan-2025 21:09:37] NOTICE: systemd monitor interval set to 10000ms
==3015783== Invalid read of size 8
==3015783==    at 0xD6928C7: read<[u8; 8]> (mod.rs:1215)
==3015783==    by 0xD6928C7: read<[u8; 8]> (mut_ptr.rs:1336)
==3015783==    by 0xD6928C7: len (thin_str.rs:116)
==3015783==    by 0xD6928C7: deref (thin_str.rs:63)
==3015783==    by 0xD6928C7: get_thin_str (string_set.rs:114)
==3015783==    by 0xD6928C7: get_or_insert<datadog_php_profiling::profiling::stack_walking::detail::handle_function_cache_slot::{closure_env#0}> (stack_walking.rs:118)
==3015783==    by 0xD6928C7: handle_function_cache_slot (stack_walking.rs:316)
==3015783==    by 0xD6928C7: collect_call_frame (stack_walking.rs:274)
==3015783==    by 0xD6928C7: {closure#0} (stack_walking.rs:233)
==3015783==    by 0xD6928C7: try_with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:270)
==3015783==    by 0xD6928C7: with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:246)
==3015783==    by 0xD6928C7: datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample (stack_walking.rs:202)
==3015783==    by 0xD69CDEA: collect_time (mod.rs:755)
==3015783==    by 0xD69CDEA: {closure#0} (wall_time.rs:118)
==3015783==    by 0xD69CDEA: try_with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:270)
==3015783==    by 0xD69CDEA: with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:246)
==3015783==    by 0xD69CDEA: ddog_php_prof_interrupt_function (wall_time.rs:94)
==3015783==    by 0xD69D529: ddog_php_prof_interrupt_function_wrapper (wall_time.rs:127)
==3015783==    by 0x46FEBF: zend_interrupt_helper_SPEC (zend_vm_execute.h:3879)
==3015783==    by 0x4BD6E4: execute_ex (zend_vm_execute.h:59318)
==3015783==    by 0x46A475: zend_call_function (zend_execute_API.c:996)
==3015783==    by 0x46A84C: zend_call_known_function (zend_execute_API.c:1090)
==3015783==    by 0x2FE0A4: spl_perform_autoload (php_spl.c:442)
==3015783==    by 0x4695B2: zend_lookup_class_ex (zend_execute_API.c:1260)
==3015783==    by 0x43FB77: zif_class_alias (zend_builtin_functions.c:1201)
==3015783==    by 0x22A0EA: ZEND_DO_ICALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:1416)
==3015783==    by 0x22B935: execute_ex.cold (zend_vm_execute.h:58814)
==3015783==  Address 0x3 is not stack'd, malloc'd or (recently) free'd
==3015783==
==3015783==
==3015783== Process terminating with default action of signal 6 (SIGABRT)
==3015783==    at 0x55D4B1C: __pthread_kill_implementation (pthread_kill.c:44)
==3015783==    by 0x55D4B1C: __pthread_kill_internal (pthread_kill.c:78)
==3015783==    by 0x55D4B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==3015783==    by 0x557B26D: raise (raise.c:26)
==3015783==    by 0x555E8FE: abort (abort.c:79)
==3015783==    by 0xCC907E6: ??? (lib.rs:43)
==3015783==    by 0xCC907D5: ??? (lib.rs:38)
==3015783==    by 0xCE72A05: ??? (panicking.rs:831)
==3015783==    by 0xCE796B2: ??? (panicking.rs:801)
==3015783==    by 0xCE7933F: ??? (panicking.rs:657)
==3015783==    by 0xCE79295: ??? (backtrace.rs:171)
==3015783==    by 0xCE7928E: ??? (panicking.rs:645)
==3015783==    by 0xC6F88F3: ??? (panicking.rs:72)
==3015783==    by 0xC6F8E61: ??? (result.rs:1649)
==3015783==
thread '<unnamed>' panicked at libdatadog/crashtracker/src/collector/emitters.rs:56:37:
called `Result::unwrap()` on an `Err` value: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==3015783== HEAP SUMMARY:
==3015783==     in use at exit: 130,892,794 bytes in 99,641 blocks
==3015783==   total heap usage: 176,076 allocs, 76,435 frees, 172,496,251 bytes allocated
==3015783==
==3015783== LEAK SUMMARY:
==3015783==    definitely lost: 38,856 bytes in 1,210 blocks
==3015783==    indirectly lost: 2,430 bytes in 20 blocks
==3015783==      possibly lost: 4,006,484 bytes in 28,439 blocks
==3015783==    still reachable: 126,845,024 bytes in 69,972 blocks
==3015783==         suppressed: 0 bytes in 0 blocks
==3015783== Rerun with --leak-check=full to see details of leaked memory
==3015783==
==3015783== For lists of detected and suppressed errors, rerun with: -s
==3015783== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==3015783== could not unlink /tmp/vgdb-pipe-from-vgdb-to-3015783-by-???-on-???
==3015783== could not unlink /tmp/vgdb-pipe-to-vgdb-from-3015783-by-???-on-???
==3015783== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-3015783-by-???-on-???
[06-Jan-2025 21:09:54] WARNING: [pool www] child 3015783 exited on signal 6 (SIGABRT - core dumped) after 19.278614 seconds from start

This was with ZEND_ALLOC=0, without that I get the following:

[06-Jan-2025 21:11:51] NOTICE: fpm is running, pid 3017204
[06-Jan-2025 21:11:53] NOTICE: ready to handle connections
[06-Jan-2025 21:11:53] NOTICE: systemd monitor interval set to 10000ms
==3017259== Invalid read of size 8
==3017259==    at 0xD6928C7: read<[u8; 8]> (mod.rs:1215)
==3017259==    by 0xD6928C7: read<[u8; 8]> (mut_ptr.rs:1336)
==3017259==    by 0xD6928C7: len (thin_str.rs:116)
==3017259==    by 0xD6928C7: deref (thin_str.rs:63)
==3017259==    by 0xD6928C7: get_thin_str (string_set.rs:114)
==3017259==    by 0xD6928C7: get_or_insert<datadog_php_profiling::profiling::stack_walking::detail::handle_function_cache_slot::{closure_env#0}> (stack_walking.rs:118)
==3017259==    by 0xD6928C7: handle_function_cache_slot (stack_walking.rs:316)
==3017259==    by 0xD6928C7: collect_call_frame (stack_walking.rs:274)
==3017259==    by 0xD6928C7: {closure#0} (stack_walking.rs:233)
==3017259==    by 0xD6928C7: try_with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:270)
==3017259==    by 0xD6928C7: with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:246)
==3017259==    by 0xD6928C7: datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample (stack_walking.rs:202)
==3017259==    by 0xD69CDEA: collect_time (mod.rs:755)
==3017259==    by 0xD69CDEA: {closure#0} (wall_time.rs:118)
==3017259==    by 0xD69CDEA: try_with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:270)
==3017259==    by 0xD69CDEA: with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:246)
==3017259==    by 0xD69CDEA: ddog_php_prof_interrupt_function (wall_time.rs:94)
==3017259==    by 0xD69D529: ddog_php_prof_interrupt_function_wrapper (wall_time.rs:127)
==3017259==    by 0x46FEBF: zend_interrupt_helper_SPEC (zend_vm_execute.h:3879)
==3017259==    by 0x4BD6E4: execute_ex (zend_vm_execute.h:59318)
==3017259==    by 0x46A475: zend_call_function (zend_execute_API.c:996)
==3017259==    by 0x46A84C: zend_call_known_function (zend_execute_API.c:1090)
==3017259==    by 0x2FE0A4: spl_perform_autoload (php_spl.c:442)
==3017259==    by 0x4695B2: zend_lookup_class_ex (zend_execute_API.c:1260)
==3017259==    by 0x43FB77: zif_class_alias (zend_builtin_functions.c:1201)
==3017259==    by 0x22A0EA: ZEND_DO_ICALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:1416)
==3017259==    by 0x22B935: execute_ex.cold (zend_vm_execute.h:58814)
==3017259==  Address 0x3 is not stack'd, malloc'd or (recently) free'd
==3017259==
==3017259==
==3017259== Process terminating with default action of signal 6 (SIGABRT)
==3017259==    at 0x55D4B1C: __pthread_kill_implementation (pthread_kill.c:44)
==3017259==    by 0x55D4B1C: __pthread_kill_internal (pthread_kill.c:78)
==3017259==    by 0x55D4B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==3017259==    by 0x557B26D: raise (raise.c:26)
==3017259==    by 0x555E8FE: abort (abort.c:79)
==3017259==    by 0xCC907E6: ??? (lib.rs:43)
==3017259==    by 0xCC907D5: ??? (lib.rs:38)
==3017259==    by 0xCE72A05: ??? (panicking.rs:831)
==3017259==    by 0xCE796B2: ??? (panicking.rs:801)
==3017259==    by 0xCE7933F: ??? (panicking.rs:657)
==3017259==    by 0xCE79295: ??? (backtrace.rs:171)
==3017259==    by 0xCE7928E: ??? (panicking.rs:645)
==3017259==    by 0xC6F88F3: ??? (panicking.rs:72)
==3017259==    by 0xC6F8E61: ??? (result.rs:1649)
==3017259==
thread '<unnamed>' panicked at libdatadog/crashtracker/src/collector/emitters.rs:89:28:
called `Result::unwrap()` on an `Err` value: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==3017259== HEAP SUMMARY:
==3017259==     in use at exit: 113,086,984 bytes in 94,454 blocks
==3017259==   total heap usage: 163,683 allocs, 69,229 frees, 149,732,288 bytes allocated
==3017259==
==3017259== LEAK SUMMARY:
==3017259==    definitely lost: 38,920 bytes in 1,212 blocks
==3017259==    indirectly lost: 2,430 bytes in 20 blocks
==3017259==      possibly lost: 4,007,428 bytes in 28,457 blocks
==3017259==    still reachable: 109,038,206 bytes in 64,765 blocks
==3017259==         suppressed: 0 bytes in 0 blocks
==3017259== Rerun with --leak-check=full to see details of leaked memory
==3017259==
==3017259== For lists of detected and suppressed errors, rerun with: -s
==3017259== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==3017259== could not unlink /tmp/vgdb-pipe-from-vgdb-to-3017259-by-???-on-???
==3017259== could not unlink /tmp/vgdb-pipe-to-vgdb-from-3017259-by-???-on-???
==3017259== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-3017259-by-???-on-???

Not sure which of these is easier to read/follow.

I've just updated PHP to 8.4.2, unfortunately the issue exists there as well.

@morrisonlevi
Copy link
Collaborator

I am able to reproduce this somewhat reliably now and am investigating.

@SanderVerkuil
Copy link
Author

That's great to hear! If you need any more information, or any more help, please do let me know, I'll be happy to help.

@SanderVerkuil
Copy link
Author

@morrisonlevi I see that you've created an MR, and I'd be happy to test it. Is there a way for me to compile the PR and test it on my environment?

@realFlowControl
Copy link
Member

Hey @SanderVerkuil,
you can download the artifact from CI from here. The datadog-setup.php from that list fetches the binary from that CI run.

@SanderVerkuil
Copy link
Author

Thanks @realFlowControl ! I managed to install it.

@morrisonlevi After using that version, the initial page was loading properly, but after one request I got the following stacktrace. So I did get a bit further, but there are still some small issues unfortunately.

==3614498== Invalid read of size 8
==3614498==    at 0xD69A3EE: read<[u8; 8]> (mod.rs:1284)
==3614498==    by 0xD69A3EE: read<[u8; 8]> (mut_ptr.rs:1340)
==3614498==    by 0xD69A3EE: len (thin_str.rs:116)
==3614498==    by 0xD69A3EE: deref (thin_str.rs:63)
==3614498==    by 0xD69A3EE: get_thin_str (string_set.rs:114)
==3614498==    by 0xD69A3EE: get_or_insert<datadog_php_profiling::profiling::stack_walking::detail::handle_function_cache_slot::{closure_env#0}> (stack_walking.rs:118)
==3614498==    by 0xD69A3EE: handle_function_cache_slot (stack_walking.rs:316)
==3614498==    by 0xD69A3EE: collect_call_frame (stack_walking.rs:274)
==3614498==    by 0xD69A3EE: {closure#0} (stack_walking.rs:233)
==3614498==    by 0xD69A3EE: try_with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:284)
==3614498==    by 0xD69A3EE: with<core::cell::RefCell<datadog_php_profiling::string_set::StringSet>, datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{closure_env#0}, core::result::Result<alloc::vec::Vec<datadog_php_profiling::profiling::stack_walking::ZendFrame, alloc::alloc::Global>, core::str::error::Utf8Error>> (local.rs:260)
==3614498==    by 0xD69A3EE: datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample (stack_walking.rs:202)
==3614498==    by 0xD69C81D: collect_time (mod.rs:755)
==3614498==    by 0xD69C81D: {closure#0} (wall_time.rs:118)
==3614498==    by 0xD69C81D: try_with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:284)
==3614498==    by 0xD69C81D: with<core::cell::RefCell<datadog_php_profiling::RequestLocals>, datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{closure_env#0}, ()> (local.rs:260)
==3614498==    by 0xD69C81D: ddog_php_prof_interrupt_function (wall_time.rs:94)
==3614498==    by 0xC7A69DA: ??? (uhook_legacy.c:229)
==3614498==    by 0xC759BF4: ??? (hook.c:1108)
==3614498==    by 0xC74C57D: ??? (interceptor.c:56)
==3614498==    by 0xC74C6A2: ??? (interceptor.c:176)
==3614498==    by 0x509AED: call_end_observers (zend_observer.c:309)
==3614498==    by 0x509AED: zend_observer_fcall_end_prechecked (zend_observer.c:315)
==3614498==    by 0x22A92D: zend_observer_fcall_end (zend_observer.h:126)
==3614498==    by 0x22A92D: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2161)
==3614498==    by 0x22B8D7: execute_ex.cold (zend_vm_execute.h:58859)
==3614498==    by 0x46A405: zend_call_function (zend_execute_API.c:996)
==3614498==    by 0x46A771: zend_call_known_function (zend_execute_API.c:1090)
==3614498==    by 0x4EC86C: zend_call_known_instance_method (zend_API.h:860)
==3614498==    by 0x4EC86C: zend_call_known_instance_method_with_0_params (zend_API.h:866)
==3614498==    by 0x4EC86C: zend_user_it_new_iterator (zend_interfaces.c:93)
==3614498==  Address 0x3 is not stack'd, malloc'd or (recently) free'd
==3614498==
thread '<unnamed>' panicked at libdatadog/crashtracker/src/collector/emitters.rs:56:37:
called `Result::unwrap()` on an `Err` value: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==3614498==
==3614498== Process terminating with default action of signal 6 (SIGABRT)
==3614498==    at 0x55D4B1C: __pthread_kill_implementation (pthread_kill.c:44)
==3614498==    by 0x55D4B1C: __pthread_kill_internal (pthread_kill.c:78)
==3614498==    by 0x55D4B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==3614498==    by 0x557B26D: raise (raise.c:26)
==3614498==    by 0x555E8FE: abort (abort.c:79)
==3614498==    by 0xCC85CC6: ??? (lib.rs:48)
==3614498==    by 0xCC85CB5: ??? (lib.rs:43)
==3614498==    by 0xCE52045: ??? (panicking.rs:831)
==3614498==    by 0xCE58772: ??? (panicking.rs:801)
==3614498==    by 0xCE583FF: ??? (panicking.rs:657)
==3614498==    by 0xCE58355: ??? (backtrace.rs:171)
==3614498==    by 0xCE58342: ??? (panicking.rs:645)
==3614498==    by 0xC6F8493: ??? (panicking.rs:72)
==3614498==    by 0xC6F88B1: ??? (result.rs:1654)
==3614498==
==3614498== HEAP SUMMARY:
==3614498==     in use at exit: 161,490,350 bytes in 118,942 blocks
==3614498==   total heap usage: 358,119 allocs, 239,177 frees, 216,216,956 bytes allocated
==3614498==
==3614498== LEAK SUMMARY:
==3614498==    definitely lost: 38,944 bytes in 1,210 blocks
==3614498==    indirectly lost: 3,488 bytes in 36 blocks
==3614498==      possibly lost: 4,014,150 bytes in 27,895 blocks
==3614498==    still reachable: 157,433,768 bytes in 89,801 blocks
==3614498==         suppressed: 0 bytes in 0 blocks
==3614498== Rerun with --leak-check=full to see details of leaked memory
==3614498==
==3614498== For lists of detected and suppressed errors, rerun with: -s
==3614498== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==3614498== could not unlink /tmp/vgdb-pipe-from-vgdb-to-3614498-by-???-on-???
==3614498== could not unlink /tmp/vgdb-pipe-to-vgdb-from-3614498-by-???-on-???
==3614498== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-3614498-by-???-on-???
[07-Jan-2025 13:37:18] WARNING: [pool www] child 3614498 exited on signal 6 (SIGABRT - core dumped) after 203.815838 seconds from start

@realFlowControl
Copy link
Member

Hey @SanderVerkuil,
would you mind trying again from a new CI run, which includes this change ccae26d?

@morrisonlevi
Copy link
Collaborator

The linked PR fixes an issue on PHP 8.4 which I believe you were hitting. I will work to get a release out soon. Let me know if this does not fix your issue.

@SanderVerkuil
Copy link
Author

@morrisonlevi Splendid! That did it. Thanks for the help everyone!

@realFlowControl
Copy link
Member

Hey @SanderVerkuil 👋
thanks for reporting this issue, the fix for this (#3019) has been released with v1.6.0 just now 🎉

@morrisonlevi
Copy link
Collaborator

morrisonlevi commented Jan 17, 2025

FYI, @SanderVerkuil, other fixes for PHP 8.4 were shipped today in v1.6.3. You should upgrade!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants