Hey Mira,
There are many ways you could recreate that shoot.
You could use your phone and the gimbal that you have, but you would need to “pan/tilt” the phone whilst moving it.
As your example is partly in a tight space, and it is out of Germany, I would lean towards using a camera slider.
I would mount an SLR (Canon/Panasonic/Sony/Nikon) or a small Blackmagic/Canon/Sony video camera on it, with the right lens (wide/macro, but not fish-eye, although fish-eye distortion can be corrected in post).
Use a tripod fluid head, and plan out your set of shots.
If you look carefully, you will notice that in several shots the camera tracks forward, whilst the lens pans in the opposite direction.
You also need to take note of the composition of the shot, and distance to the subject.
For extra control, you could use a motorised slider with a controlled panhead, either real-time or time-lapse.
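If you do go the motorised route, the counter-pan can be worked out from simple geometry. Below is a rough Python sketch of that idea, not something tied to any particular slider or controller. It assumes the subject stays still, the rail is straight, and you have measured how far along the rail the subject sits and how far away from the rail it is. The function and parameter names are made up for illustration.

```python
import math

def pan_angle_deg(slider_pos_m, subject_along_m, subject_offset_m):
    """Pan angle (degrees) that keeps a static subject centred in frame.

    0 means the lens points square to the rail; positive means the lens
    is turned towards the direction of travel, negative means turned back.
    All distances are in metres and are assumed measurements.
    """
    return math.degrees(
        math.atan2(subject_along_m - slider_pos_m, subject_offset_m)
    )

def pan_keyframes(rail_length_m, subject_along_m, subject_offset_m, steps):
    """Evenly spaced (slider position, pan angle) pairs along the rail.

    steps must be at least 1; the result has steps + 1 entries.
    """
    keyframes = []
    for i in range(steps + 1):
        pos = rail_length_m * i / steps
        angle = pan_angle_deg(pos, subject_along_m, subject_offset_m)
        keyframes.append((pos, angle))
    return keyframes

# Example: 1 m rail, subject opposite the middle of the rail, 0.5 m away.
for pos, angle in pan_keyframes(1.0, 0.5, 0.5, 4):
    print(f"slider {pos:.2f} m -> pan {angle:+.1f} deg")
```

As the slider position increases the pan angle decreases, which is the "pan against the move" you see in the example shots. The closer the subject is to the rail, the faster the pan has to change, which is why the distance to the subject matters so much.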
Hope this helps you get closer to your desired result.
Atb
Mads