Hotel Data Automation for Delphi (Salesforce)

Migrating from CSVs to a resilient data pipeline for Amadeus Delphi (Salesforce): a success story of automated hotel onboarding for ML benchmarking.

Matt Chad
December 24, 2025

Amadeus Delphi is the backbone of sales and catering for thousands of hotels. It is built on Salesforce, but it isn’t a standard Salesforce implementation. It’s a complex, often undocumented environment that stores the exact data a Machine Learning (ML) engine needs to optimize hotel operations.

I recently led a project for a client whose ML system provides cost benchmarking for hotel owners. Their system was powerful, but their onboarding was broken. They relied on hotel staff manually exporting CSVs from Delphi.

CSVs are a fine starting point for a prototype, but to scale you eventually need a proper API integration.

We replaced that manual import with a resilient, automated data pipeline built on Symfony, RabbitMQ, and the Salesforce Bulk API 2.0.

Here is how we architected a system that turned a manual bottleneck into a 10x growth engine.

The Core Technical Stack

We needed an architecture that could handle the initial sync: one year of historical data plus one year of future bookings. We also had to respect the throttling limits Salesforce enforces on its API.

Salesforce Bulk API 2.0

While most developers reach for the standard Salesforce REST API, it’s the wrong tool for this job. Pulling two years of room nights record-by-record would have eaten through the 10,000 requests-per-day limit in minutes.

We used Bulk API 2.0 instead. It’s designed for high-volume data: you submit a SOQL query, Salesforce processes it on its end, and then provides a flat CSV stream. Using Symfony’s HttpClient, we could process these large files efficiently, mapping them directly to Doctrine entities without loading the entire dataset into memory.
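The streaming idea can be sketched in a few lines of Python (our actual implementation is PHP with Symfony’s HttpClient, and the field names below are illustrative, not the real Delphi schema): iterate the CSV result lazily instead of materializing it.

```python
import csv
import io

def stream_records(csv_lines):
    """Lazily yield one record (dict) at a time from an iterable of CSV
    lines, so a large Bulk API result set never sits in memory at once."""
    reader = csv.DictReader(csv_lines)
    for row in reader:
        yield row

# Simulated chunk of a Bulk API 2.0 result stream.
sample = io.StringIO(
    "Id,nihrm__ArrivalDate__c,nihrm__BlockedRooms__c\n"
    "a011,2025-01-10,40\n"
    "a012,2025-01-11,35\n"
)
records = stream_records(sample)
first = next(records)  # only one row decoded so far
```

Each row can be handed straight to the persistence layer and then discarded, which keeps memory flat regardless of result size.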

Async Processing with Symfony Messenger & RabbitMQ

Integrations like this shouldn’t run in the request-response cycle.

We used Symfony Messenger backed by RabbitMQ.

A scheduler dispatches a data-fetch message for each hotel account added to our app.

A background worker then handles the OAuth handshake, the Bulk API job creation, and the data ingestion. This setup also gives us automatic retries. If Salesforce returns a 503 or a network timeout occurs, the message goes back to the queue for a retry, ensuring we never miss a booking update.
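A minimal sketch of the relevant Messenger configuration; the transport names and retry numbers here are illustrative, not the client’s actual values:

```yaml
# config/packages/messenger.yaml (illustrative fragment)
framework:
    messenger:
        failure_transport: failed
        transports:
            async:
                dsn: '%env(MESSENGER_TRANSPORT_DSN)%'  # amqp:// DSN for RabbitMQ
                retry_strategy:
                    max_retries: 5
                    delay: 10000      # first retry after 10 s
                    multiplier: 3     # then 30 s, 90 s, ...
            failed:
                dsn: 'doctrine://default?queue_name=failed'
```

With a `retry_strategy` like this, a 503 or timeout simply re-queues the message with an increasing delay; only after the final retry does it land on the failure transport for inspection.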

Scalable Architecture: Docker & Symfony Messenger

To handle the scale of syncing years of historical reservation data, we needed a resilient infrastructure. We containerized the entire application using Docker, separating the web-facing Symfony app from the background workers to ensure scalability.

Categorized Worker Queues

One of the most important architectural decisions was how we handled memory.

Salesforce Bulk data is unpredictable. A single query might return ten rows or ten thousand.

To prevent resource exhaustion, we categorized our Symfony Messenger workers into three distinct priorities:

  • async_high_priority: For tasks requiring predictable memory and fast execution.
  • async_low_priority: For standard, non-urgent background tasks and cleanup.
  • async_bulk: Specifically for processing batched data. These workers get a higher memory limit (up to 1024 MB) to handle large CSV streams from Salesforce without crashing.
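As a deployment sketch (transport names as above, process-manager wiring via Supervisor or systemd omitted), the workers can be started along these lines:

```shell
# Fast lane: one worker drains high-priority messages before low-priority ones.
php bin/console messenger:consume async_high_priority async_low_priority \
    --time-limit=3600

# Bulk lane: isolated worker with a raised memory ceiling; Messenger
# stops it gracefully once the limit is reached so the process manager
# can restart it fresh.
php bin/console messenger:consume async_bulk \
    --memory-limit=1024M --time-limit=3600
```

Keeping the bulk consumers in their own processes means a ten-thousand-row CSV can never starve the fast lane of memory.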

The Sync Lifecycle: A Multi-Stage ETL Flow

Using Salesforce Bulk API 2.0 meant we couldn’t just fetch and save. The API works asynchronously on Salesforce’s side, so we built a multi-stage pipeline:

  1. RequestQueryJob: A scheduled message (running every 3 hours) sends a SOQL query to Salesforce to find updates.
  2. MonitorQueryJob: A worker polls Salesforce to check the job status. If it’s still processing, we throw a recoverable exception to delay the message and try again later.
  3. ProcessQueryJob: Once complete, we pull the results.
  4. Data Ingestion: We stream the CSV results into a generic SObject table in our database.
  5. Transformation: Finally, a TransformEventsMessage takes that raw data and maps it into our normalized entities (SeerioEvent, SeerioReservationDay).
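The monitoring step (stage 2) reduces to a small state decision. A Python sketch: the job states are the real Bulk API 2.0 ones, while the action names and dispatch plumbing are illustrative stand-ins for our Messenger handlers.

```python
# What MonitorQueryJob decides, given the state Salesforce reports.
RETRY = "retry_later"    # re-queue the monitor message with a delay
FETCH = "fetch_results"  # dispatch ProcessQueryJob
FAIL = "mark_failed"     # surface the error; no automatic retry

def next_action(job_state: str) -> str:
    if job_state in ("UploadComplete", "InProgress"):
        return RETRY   # Salesforce is still crunching the query
    if job_state == "JobComplete":
        return FETCH   # the results CSV is ready to stream
    return FAIL        # "Failed", "Aborted", or anything unexpected
```

In Symfony terms, the `RETRY` branch is where we throw the recoverable exception so Messenger re-delivers the message later.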

Solving the Delphi Documentation Gap

The biggest challenge was the lack of public documentation for Delphi custom objects. All custom objects in Delphi are prefixed with the nihrm__ namespace. Finding where the real reservation data lives is a game of trial and error.

To work around this, I built a custom SOQL console into our Symfony admin panel (restricted to superadmins). It let us query the hotel’s live Salesforce instance in real time to identify the correct fields. We had to map several specific objects to get a full picture of a hotel’s performance:

  • nihrm__Booking__c: The high-level event or reservation.
  • nihrm__RoomBlock__c: The groups of rooms held for specific dates.
  • nihrm__BookingRoomNight__c: The granular daily data points (rates and availability).
  • nihrm__FunctionRoom__c: Data regarding the physical spaces used.

Optimizing for Efficiency and Limits

Salesforce limits are the primary constraint. Initially, we tried pulling child objects (Room Nights) separately from the parent Bookings. However, we hit the daily API request limits pretty fast because of the N+1 problem.

The fix was nested SOQL queries. By structuring the query to fetch the parent booking and its related room nights in one pass, we cut our API footprint to essentially two requests per sync.

Because we filter for records modified in the last 24 hours, daily syncs now run in minutes and leave most of the hotel’s API quota untouched.
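Put together, the shape of the query looks roughly like this. The child relationship name `nihrm__Booking_Room_Nights__r` and the exact field list are assumptions to verify against the live schema (the SOQL console above is exactly the tool for that):

```sql
-- Parent bookings plus their child room nights in a single nested query,
-- restricted to records touched in roughly the last day.
SELECT Id, Name, nihrm__ArrivalDate__c, nihrm__DepartureDate__c,
       (SELECT Id, nihrm__Date__c, nihrm__BlockedRooms__c
        FROM nihrm__Booking_Room_Nights__r)
FROM nihrm__Booking__c
WHERE SystemModstamp >= LAST_N_DAYS:1
```

One Bulk API job for this query replaces what would otherwise be one request per booking, which is where the N+1 savings come from.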

Security: PKCE and OAuth2 in Symfony

For enterprise-grade Salesforce integrations, PKCE (Proof Key for Code Exchange) is the modern requirement.

The standard knpuniversity/oauth2-client-bundle is a great starting point, but it doesn’t handle PKCE out of the box for Salesforce.

I had to modify the provider to handle the code challenge and verifier generation. This ensures that even if an authorization code is intercepted, it’s useless to an attacker. For a client handling sensitive hotel financial data, this level of security was a non-negotiable requirement.
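The verifier/challenge generation itself is straightforward. Here is a Python sketch of the RFC 7636 S256 flow (our version lives inside the customized OAuth2 provider in PHP, and the function name is illustrative):

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Generate an RFC 7636 code_verifier and its S256 code_challenge."""
    # 32 random bytes -> 43-char base64url verifier (padding stripped).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # The challenge is the base64url-encoded SHA-256 of the verifier.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

The challenge travels with the authorization request; the verifier is only revealed in the token exchange, so an intercepted authorization code alone is useless.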

Challenges with Multiple Hotel Properties

Halfway through development, we hit the reality of the hospitality business: one Salesforce login often controls multiple hotels. A management company might have 20 properties under one account.

Our system had to be adjusted to handle this. After the initial OAuth connection, we pull a list of all nihrm__Property__c records and group bookings for them. This prevents data leakage and ensures the benchmarking engine is comparing apples to apples.
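The grouping step is conceptually simple; a Python sketch with illustrative field names (in production this is a lookup on the booking’s property reference):

```python
from collections import defaultdict

def group_by_property(bookings):
    """Bucket raw booking rows by their parent property id, so each hotel
    in a multi-property account is benchmarked in isolation."""
    grouped = defaultdict(list)
    for booking in bookings:
        grouped[booking["nihrm__Property__c"]].append(booking)
    return dict(grouped)

bookings = [
    {"Id": "b1", "nihrm__Property__c": "prop_A"},
    {"Id": "b2", "nihrm__Property__c": "prop_B"},
    {"Id": "b3", "nihrm__Property__c": "prop_A"},
]
by_property = group_by_property(bookings)
```

Each bucket then feeds the benchmarking engine separately, which is what keeps one property’s numbers from contaminating another’s.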

The Result

The impact was immediate.

The client went from a manual onboarding process to a near-automated one.

  • Onboarding Scale: They successfully onboarded 10 new customers in the first few weeks. That was impossible with the old CSV-upload method.
  • Data Reliability: The ML system now receives updates multiple times a day, making its hospitality intelligence far more accurate.
  • Resilience: Thanks to RabbitMQ, the system handles Salesforce’s occasional API hiccups without dropping a single record.

Need to Automate Your Salesforce Integration?

Building a bridge between Salesforce and a proprietary system requires more than an API key. You need to understand custom nihrm__ schema, worker memory management, and secure token handling.

I specialize in building efficient data pipelines for Magento, Laravel, and Symfony. If you need to extract your data from Salesforce, let’s talk.

