Best way to update tables from CSV

Hello

I’m searching for the more efficient way to update tables from CSV.

I have a snippet performing the following actions :

CSV download
update 2 tables
move CSV files to an archive folder after update

The snippet runs every day via a CRON task.

Here is the code part for updates.
With a CSV file < 1000 lines it works fine but with 15000 I get systematically a 504 timeout after 5 minutes.

I there a way to optimize a such process ?

<?php
if (is_array($files)) {

    $line_count = 0;

    foreach ($files as $file) {

        //loop files

        foreach (file($file) as $line) {

            $line_count++;

            //loop lines

            if ($line_count > 1) {
                //ignore first CSV line
                $proddispo_ar = explode(";", $line);
                $prod_refsupplier = $proddispo_ar[0];
                $prod_status = $proddispo_ar[1];

                /*******************
                 * 2 tables to update
                 * pss_product + pss_product_shop
                 */

                $query = $ps->newQuery('PssProduct');
                $query->where(array('supplier_reference' => $prod_refsupplier, 'id_supplier' => 118));
                $productrow_obj = $ps->getObject('PssProduct', $query);

                if ($productrow_obj) {

                    // update pss_product table
                    $id_product = $productrow_obj->get('id_product');
                    $productrow_obj->set('active', $prod_status);
                    $productrow_obj->save();

                    //update pss_product_shop table
                    $query_pssproductshop = $ps->newQuery('PssProductShop');
                    $query_pssproductshop->where(array('id_product' => $id_product, 'id_shop' => 65));
                    $pssproductshop_results = $ps->getCollection('PssProductShop', $query_pssproductshop);
                    foreach ($pssproductshop_results as $shop) {
                        $shop->set('active', $prod_status);
                        $shop->save();
                    }
                }
            }
        }
        $output .= "line_count: " . $line_count;

        $filename = basename($file);

        /*******************
         *Move file after update to /ARC (distant)
         */
        if ($sftp == false) {
            //en mode FTP
            ftp_rename($ftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
        } else {
            //en mode SFTP
            ssh2_sftp_rename($sftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
        }

        /*********************
         * Move file after update to /ARC (local)
         **/
        if ($debug == true) {
            $output .= '<br><br>déplacement du fichier local dans le dossier d\'archivage en cours...';
        }
        rename($dl_path . $remoteDir . $filename, $dl_path . $remoteArcDir . $filename);

    }

}

Maybe you can use set_time_limit in your code to avoid a timeout (or change max_execution_time in php.ini).


If you use PDO instead of xPDO, I assume you can make your code run faster.
E.g. instead of code like this

$query_pssproductshop = $ps->newQuery('PssProductShop');
$query_pssproductshop->where(array('id_product' => $id_product, 'id_shop' => 65));
$pssproductshop_results = $ps->getCollection('PssProductShop', $query_pssproductshop);
foreach ($pssproductshop_results as $shop) {
	$shop->set('active', $prod_status);
	$shop->save();
}

run a simple UPDATE SQL statement like this

$stmt = $modx->prepare("UPDATE `my_custom_table` SET `active` = ? WHERE id_shop = ? AND id_product = ?");
$stmt->execute([$prod_status, 65, $id_product]);

Also adding indexes to your custom database table may increase the execution time (if there’s a lot of data in the table).

If you can increase the PHP max execution time (or get your host to do it), it might solve the problem. It’s also possible that you’re hitting the upload_max_filesize limit, or the memory limit.

It might be worth figuring out whether the failure is occurring during the download or the DB update. Just comment out the DB update section. If you still get the error, it’s the download that’s breaking things.

Using PDO with a prepared query will speed things up a lot. Another thing to try would be to split the task into two separate cron jobs that run one after the other, and put the download in one and updating the DB in the other one.

There may also be ways to speed up the download if that’s where the error occurs. What does your download code look like?

If there’s any way to read x lines from the file and write a query that updates multiple records at once, rather than having a DB query for every record change, that might speed things up a lot.

Currently,
upload_max_filesize = 250M
max_execution_time = 7200
set_time_limit = no entry php info

It might be worth figuring out whether the failure is occurring during the download or the DB update. Just comment out the DB update section. If you still get the error, it’s the download that’s breaking things.

Here is the complete code:


$debug = true;
$notifTech = "XXXXXXX@XXXXXXXX.fr";


/******************************
 * Verifications
 * Clé présente et correcte
 * Des fichiers CSV existent
 * */
if(isset($_GET['k']) && $_GET['k'] == 'XXXXXXXXXXX') {
    $private = dirname(MODX_CORE_PATH);    
    $dl_path = $private . '/tools/workflow/fos';
    if($debug==true){ $output .= "<br> dl_path: " . $dl_path . "<br>"; }
    
    /******************
     * mode de connexion ftp
     * $sftp = false | true
     **/
    $sftp = true;
    
    if($sftp==false){
        /******************
         * connexion ftp
         **/
        $ftp_conn = ftp_connect("xxxxxxx.fr") or die("Cannot connect");
        ftp_login($ftp_conn, "xxxxxxx", "xxxxxxx") or die("Cannot login");
        ftp_pasv($ftp_conn, true) or die("Cannot change to passive mode");
        $files_remote = ftp_nlist($ftp_conn, "/EMI/*.txt");
        foreach ($files_remote as $file_remote){
            // download with ftp_get
            // download server file
            $local_file = $dl_path . $file_remote;
            //$local_file = $dl_path . "test.txt";
            if($debug==true){ 
                $output .= "Found file_remote: ' . $file_remote\n";
                $output .= "<br>local_file: " . $local_file;
            }    
            if (ftp_get($ftp_conn, $local_file, $file_remote, FTP_ASCII ))
              {
                if($debug==true){  $output .= "<br>Successfully written to $local_file."; }
              }
            else
              {
                if($debug==true){ $output .= "<br>Error downloading $file_remote."; }
              }
        }
    } else {
        /******************
        * connexion SFTP
        * https://www.php.net/manual/en/function.ssh2-sftp.php
        **/

        //preprod
        /*
        $host = 'XXXX-XXXX-XXXXX.XXXXXXX.fr';
        $username = 'XXXXXXX';
        $password = "XXXXXXXX";
        */
        
        $host = 'XXXXXXX';
        $username = 'XXXXXXX';
        $password = "XXXXXXX";
        $port = XXXXXXX;
        
        //dossier où les fichiers FT doivent être récupérés
        if($debug==true){$remoteDir = '/ftp/out/fd/';} else {$remoteDir = '/ftp/out/fd/';}
        //dossier d'archive
        if($debug==true){$remoteArcDir = '/ftp/arc/';} else {$remoteArcDir = '/ftp/arc/';}
        $localDir = $dl_path.'/'.$remoteDir.'/';
        
        //Gestion des imprevus
        if (!function_exists("ssh2_connect")){
            die('Function ssh2_connect does not exist.');
        }
         
        if (!$connection = ssh2_connect($host, $port)){
            die('Failed to connect.');
        }
         
        if (!ssh2_auth_password($connection, $username, $password)){
            die('Failed to authenticate.');
        }
         
        if (!$sftp_conn = ssh2_sftp($connection)){
            die('Failed to create a sftp connection.');
            
        }
         
        if (!$dir = opendir("ssh2.sftp://$sftp_conn$remoteDir")){
            die('Failed to open the directory.');
        }
         
        $files_remote = array();
        while ( ($file_remote = readdir($dir)) !== false)
        {
            if(substr($file_remote, -4)==".csv")
            {
                $files_remote[]=$file_remote;
            }
        }
        closedir($dir);
         
        foreach ($files_remote as $file_remote)
        {
            if($debug==true){ $output .= "Copying file: $file_remote\n"; }
            if (!$remote = fopen("ssh2.sftp://$sftp_conn$remoteDir$file_remote", 'r'))
            {
                if($debug==true){ $output .=  "Failed to open remote file: $file_remote\n"; }
                continue;
            }
         
            if (!$local = fopen($localDir . $file_remote, 'w'))
            {
                if($debug==true){ $output .= "Failed to create local file: $file_remote\n"; }
                continue;
            }
         
            $read = 0;
            $filesize = filesize("ssh2.sftp://$sftp_conn/$remoteDir$file_remote");
            while ( ($read < $filesize) && ($buffer = fread($remote, $filesize - $read)) )
            {
                $read += strlen($buffer);
                if (fwrite($local, $buffer) === FALSE)
                {
                    if($debug==true){ $output .= "Failed to write to local file: $file_remote\n"; }
                    break;
                }
            }
            fclose($local);
            fclose($remote);
        }
        
    }
    

    /************ Definition des chemins **********/
    $ftp_in = $dl_path . $remoteDir;
    
    // Chemin pour les fichiers traités a archiver (deplacement)
    //$ftp_out = '/ARC/';
    

    //Get CSV files
    $ext = "*.csv";//extension
    $files = glob($ftp_in . "*.csv");
    if($debug==true){
        $output .= "<br><strong>Dossier scanné :</strong> " . $ftp_in . '<br>';
    }
   /************ Verifi. si des fichiers **********/
    if (!$files) {
        if($debug==true){
            $output .= "<br><strong>Aucun fichier ". $ext." à traiter ! </strong><br>";
        } else {
            //die();
        }
    }
    
   
} else {
    // Invalid key - clé non présente ou incorrecte : renvoyer 401
   $modx->sendUnauthorizedPage();
}

$modx->invokeEvent('OnHandleRequest');
$ps_database_host     = $modx->getOption('sconf_psdb-host');
$ps_database_charset  = $modx->getOption('sconf_psdb-charset');
$ps_database_name     = $modx->getOption('sconf_psdb-name');
$ps_database_prefix   = $modx->getOption('sconf_psdb-prefix');
$ps_database_username = $modx->getOption('sconf_psdb-username');
$ps_database_password = $modx->getOption('sconf_psdb-pw');
$ps = new xPDO('mysql:host=' . $ps_database_host .
        ';dbname=' . $ps_database_name .
        ';charset=' . $ps_database_charset,
    $ps_database_username,
    $ps_database_password );
// text connex
//echo  $o = ($ps->connect()) ? 'Connected' : 'Not Connected';
if($debug==true){
    $output .= "********************** MODE DEBUF ACTIF *********************** <br>";
    $o = ($ps->connect()) ? 'Connected' : 'Not Connected';
    $output .= 'Test connexion base de données : ' . $o . '<br>';
}
// load package
$can_work_ps = $ps->addPackage('presta',MODX_CORE_PATH.'components/',$ps_database_prefix);
if (!$can_work_ps) die('Prestashop Package could not be loaded.');


if (is_array($files)) {
    
    $output .= "is_array files exists";
    $line_count = 0;
    
    foreach($files as $file) {
         //loop files, normaly only one file is downloaded
         foreach(file($file) as $line) {
           //loop CSV lines
            $line_count++;
            if($line_count > 1){
                //ignore csv header - ignorer l'entete du csv avec la fefinition des champs
                $proddispo_ar = explode(";", $line);
                $prod_refsupplier = $proddispo_ar[0];//normalement non prefixee, TODO: prevoir cas avec prefixe?
                $prod_status = $proddispo_ar[1];
                
                /*******************
                 * Update tables - 2 tables a mettre a jour : 
                 * pss_product + pss_product_shop
                 */
                $query = $ps->newQuery('PssProduct');
                $query->where(array('supplier_reference' => $prod_refsupplier, 'id_supplier' => 118));
                $productrow_obj = $ps->getObject('PssProduct', $query);
                if($productrow_obj){
                    //update pss_product table
                    $id_product = $productrow_obj->get('id_product');
                    $productrow_obj->set('active', $prod_status);
                    $productrow_obj->save();
                    //update pss_product_shop table
                    $query_pssproductshop = $ps->newQuery('PssProductShop');
                    $query_pssproductshop->where(array('id_product' => $id_product, 'id_shop' => 65));
                    $pssproductshop_results = $ps->getCollection('PssProductShop', $query_pssproductshop);
                    foreach($pssproductshop_results as $shop){
                        //$output .= '<br>id_shop: ' . $shop->get('id_shop') . ' active:' . $shop->get('active');
                        $shop->set('active', $prod_status);
                        $shop->save();
                    }
                }
            }
        }
        
        if($debug==true){ $output .= "line_count: " . $line_count; }
        
        
        //Move filesdeplacer les fichier dans /ARC (cas traitement sur FTP distant)
        $filename = basename($file);
        if($debug==true){
            $output .= '<br><br><strong>déplacement du fichier distant dans le dossier d\'archivage en cours......';
            $output .= '<br>from: ' . $remoteDir . $filename;
            $output .= '<br>to: ' . $remoteArcDir . $filename;
        }
        
        /*******************
         * Move files (remote FTP) Deplacement des fichiers distants dans le repertoire archives
         */
        if($sftp==false){
            //en mode FTP
            ftp_rename($ftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
        } else {
            //en mode SFTP
            ssh2_sftp_rename($sftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
        }
        
        
        /*********************
         * Move files (local) Deplacement fichier local : necessaire sinon sera traité à nouveau
         **/
        if($debug==true){
             $output .= '<br><br>déplacement du fichier local dans le dossier d\'archivage en cours...';
        }
        rename($dl_path . $remoteDir . $filename, $dl_path . $remoteArcDir . $filename);
         
    }

}


if($debug==true){
    return $output;
}
type or paste code here

You can speed up the DB part a lot by creating a parameterized, prepared query statement outside your loop, binding your variables to terms of the statement and calling just execute() inside your loop.

Note that the example at that link is not calling newQuery(). MODX already has an instance of PDO on tap, so $modx->prepare() just calls $modx->pdo->prepare(), and that will only happen once at the top of your code.

With a prepared query, the query is created just once. You’re creating three brand-new queries for each line of the file, and since you’re doing it through MODX, it involves extra parsing and permission and policy checks for every one of them.

For the file download, some native FTP services can be very slow. Here’s a utility for downloading large files you might try.

Hi,
Thank you for your reply, it seems very interesting, I’ll try it next.

I have updated my code but nothing happens (no update).
I’m not sure about this part:

    $table_pss_product = $ps->getTableName('pssProduct');//className
    $updateStatement = $ps->prepare("UPDATE " . $table_pss_product . "
    SET `active`=:active
        WHERE `supplier_reference`=:supplier_reference");
    $updateStatement->bindParam(':active', $active, PDO::PARAM_INT);//$active is int (tinyint(1))
    $updateStatement->bindParam(':supplier_reference', $supplier_reference);//string (varchar(32) )

The table is in a remote database. I don’t know if getTableName can works outside MODX.

The complete code:

<?php


$debug = true;
$notifTech = "xxx@xxxx.fr";
$movefiles = false;//pour faciliter les tests

/******************************
 * Verifications
 * Clé présente et correcte
 * Des fichiers XML existent
 * */
if(isset($_GET['k']) && $_GET['k'] == 'xxxxxxx') {
    //continuer
    $private = dirname(MODX_CORE_PATH);///var/www/clients/client1/web118/private    
    $dl_path = $private . '/tools/workflow/fos';
    if($debug==true){ $output .= "<br> dl_path: " . $dl_path . "<br>"; }
    
    /******************
     * mode de connexion ftp
     * $sftp = false | true
     **/
    $sftp = true;
    
    if($sftp==false){
        /******************
         * connexion ftp
         **/
        $ftp_conn = ftp_connect("xxxxx.fr") or die("Cannot connect");
        ftp_login($ftp_conn, "xxxxxx", "xxxxxx") or die("Cannot login");
        ftp_pasv($ftp_conn, true) or die("Cannot change to passive mode");
        $files_remote = ftp_nlist($ftp_conn, "/EMI/*.txt");
        foreach ($files_remote as $file_remote){
            // download with ftp_get
            // download server file
            $local_file = $dl_path . $file_remote;
            //$local_file = $dl_path . "test.txt";
            if($debug==true){ 
                //echo "echo";
                $output .= "Found file_remote: ' . $file_remote\n";
                $output .= "<br>local_file: " . $local_file;
            }    
            if (ftp_get($ftp_conn, $local_file, $file_remote, FTP_ASCII ))
              {
                if($debug==true){  $output .= "<br>Successfully written to $local_file."; }
              }
            else
              {
                if($debug==true){ $output .= "<br>Error downloading $file_remote."; }
              }
        }
    } else {
        /******************
        * connexion SFTP
        * https://www.php.net/manual/en/function.ssh2-sftp.php
        * https://randomdrake.com/2012/02/08/listing-and-downloading-files-over-sftp-with-php-and-ssh2/
        **/

        //preprod
        $host = 'xxx-xxxx-xxxx.xxxx.fr';
        $username = 'xxxx';
        $password = "xxxxx";
        $port = 22;

        //dossier où les fichiers FT doivent être récupérés
        if($debug==true){$remoteDir = '/ftp/out/fd/';} else {$remoteDir = '/ftp/out/fd/';}
        //dossier d'archive
        if($debug==true){$remoteArcDir = '/ftp/arc/';} else {$remoteArcDir = '/ftp/arc/';}
        $localDir = $dl_path.'/'.$remoteDir.'/';
        
        //Gestion des imprevus
        if (!function_exists("ssh2_connect")){
            die('Function ssh2_connect does not exist.');
        }
         
        if (!$connection = ssh2_connect($host, $port)){
            die('Failed to connect.');
        }
         
        if (!ssh2_auth_password($connection, $username, $password)){
            die('Failed to authenticate.');
        }
         
        if (!$sftp_conn = ssh2_sftp($connection)){
            die('Failed to create a sftp connection.');
            
        }
         
        if (!$dir = opendir("ssh2.sftp://$sftp_conn$remoteDir")){
            die('Failed to open the directory.');
        }
         
        $files_remote = array();
        while ( ($file_remote = readdir($dir)) !== false)
        {
            if(substr($file_remote, -4)==".csv")
            {
                $files_remote[]=$file_remote;
            }
        }
        closedir($dir);
         
        foreach ($files_remote as $file_remote)
        {
            if($debug==true){ $output .= "Copying file: $file_remote\n"; }
            if (!$remote = fopen("ssh2.sftp://$sftp_conn$remoteDir$file_remote", 'r'))
            {
                if($debug==true){ $output .=  "Failed to open remote file: $file_remote\n"; }
                continue;
            }
         
            if (!$local = fopen($localDir . $file_remote, 'w'))
            {
                if($debug==true){ $output .= "Failed to create local file: $file_remote\n"; }
                continue;
            }
         
            $read = 0;
            $filesize = filesize("ssh2.sftp://$sftp_conn/$remoteDir$file_remote");
            while ( ($read < $filesize) && ($buffer = fread($remote, $filesize - $read)) )
            {
                $read += strlen($buffer);
                if (fwrite($local, $buffer) === FALSE)
                {
                    if($debug==true){ $output .= "Failed to write to local file: $file_remote\n"; }
                    break;
                }
            }
            fclose($local);
            fclose($remote);
        }
        
    }
    

    /************ Definition des chemins **********/
    
    /* cas distant : recuperetation de fichies sur un serveur  */
    //$ftp_path = dirname(MODX_CORE_PATH) . '/tools/sandbox/sandbox_ftp1/';
    //$ftp_path = $dl_path;
    // Chemin FTP dédié pour la réception ds fichiers XML
    
    //$ftp_in = $dl_path . '/EMI/';
    $ftp_in = $dl_path . $remoteDir;
    
    // Chemin pour les fichiers traités a archiver (deplacement)
    //$ftp_out = '/ARC/';
    

    //Get XML files
    $ext = "*.csv";//extension
    $files = glob($ftp_in . "*.csv");
    if($debug==true){
        $output .= "<br><strong>Dossier scanné :</strong> " . $ftp_in . '<br>';
    }
   /************ Verifi. si des fichiers XML existent **********/
    if (!$files) {
        if($debug==true){
            $output .= "<br><strong>Aucun fichier ". $ext." à traiter ! </strong><br>";
        } else {
            //die();
        }
    }
    
   
} else {
    // clé non présente ou incorrecte : renvoyer 401
   $modx->sendUnauthorizedPage();
   /*
    header('HTTP/1.1 401 Unauthorized');
    @session_write_close();
    exit(0);
    */
}

$modx->invokeEvent('OnHandleRequest');
$ps_database_host     = $modx->getOption('sconf_psdb-host');
$ps_database_charset  = $modx->getOption('sconf_psdb-charset');
$ps_database_name     = $modx->getOption('sconf_psdb-name');
$ps_database_prefix   = $modx->getOption('sconf_psdb-prefix');
$ps_database_username = $modx->getOption('sconf_psdb-username');
$ps_database_password = $modx->getOption('sconf_psdb-pw');
$ps = new xPDO('mysql:host=' . $ps_database_host .
        ';dbname=' . $ps_database_name .
        ';charset=' . $ps_database_charset,
    $ps_database_username,
    $ps_database_password );
// text connex
//echo  $o = ($ps->connect()) ? 'Connected' : 'Not Connected';
if($debug==true){
    echo "echo";
    $output .= "********************** MODE DEBUF ACTIF *********************** <br>";
    $o = ($ps->connect()) ? 'Connected' : 'Not Connected';
    $output .= 'Test connexion base de données : ' . $o . '<br>';
}
// load package
$can_work_ps = $ps->addPackage('presta',MODX_CORE_PATH.'components/',$ps_database_prefix);
if (!$can_work_ps) die('Prestashop Package could not be loaded.');




if (is_array($files)) {
    
    /*******************************************************
    * TEST parameterized, prepared query statement
    * https://bobsguides.com/blog.html/2014/12/17/using-parameterized-prepared-statements-to-retrieve-data/
    */
    $table_pss_product = $ps->getTableName('pssProduct');//className
    $updateStatement = $ps->prepare("UPDATE " . $table_pss_product . "
    SET `active`=:active
        WHERE `supplier_reference`=:supplier_reference");
    $updateStatement->bindParam(':active', $active, PDO::PARAM_INT);//$active is int (tinyint(1))
    $updateStatement->bindParam(':supplier_reference', $supplier_reference);//string (varchar(32) )

    $output .= "is_array files exists";
    
    $line_count = 0;
    
    foreach($files as $file) {
         //loop files
         
        // Read a CSV file
        $handle = fopen($file, "r");
        
        // Optionally, you can keep the number of the line where
        // the loop its currently iterating over
        $lineNumber = 1;
        
        // Iterate over every line of the file
        while (($raw_string = fgets($handle)) !== false) {
            
            if($lineNumber > 1){
                // Parse the raw csv string: "1, a, b, c"
                $row = str_getcsv($raw_string);
            
                // into an array: ['1', 'a', 'b', 'c']
                // And do what you need to do with every line
                //var_dump($row); 
                //print_r($row);
                $line=$row[0];
                $proddispo_ar = explode(";", $line);
                $prod_refsupplier = $proddispo_ar[0];//normalement non prefixee, TODO: prevoir cas avec prefixe?
                $prod_status = $proddispo_ar[1];
                
                $output .= '<br>ref fournisseur: ' . $prod_refsupplier;
                $output .= ' prod_status: ' . $prod_status . '<br>';

                //new https://bobsguides.com/blog.html/2014/12/17/using-parameterized-prepared-statements-to-retrieve-data/
                $supplier_reference = $proddispo_ar[0];  
                $active = $proddispo_ar[1];
                 
                // Execute update           
                $updateStatement->execute();
            }

            // Increase the current line
            $lineNumber++;
        }
        
        fclose($handle);
        //print_r($rows_ar);//Array ( [id_shop] => 40 ) Array ( [id_shop] => 41 )
        //die("333");//ok works until here

        $output .= "line_count: " . $lineNumber;
        //anciennes position

        //deplacer les fichier dans /ARC (cas traitement sur FTP distant)
        $filename = basename($file);
        
        
        if($debug==true){
             $output .= '<br><br><strong>déplacement du fichier distant dans le dossier d\'archivage en cours......';
             $output .= '<br>from: ' . $remoteDir . $filename;
             $output .= '<br>to: ' . $remoteArcDir . $filename;
        }
        
        
        if($movefiles==true){
            /*******************
             * Deplacement des fichiers distants dans le repertoire archives
             */
            if($sftp==false){
                //en mode FTP
                ftp_rename($ftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
            } else {
                //en mode SFTP
                ssh2_sftp_rename($sftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
                
                //ssh2_sftp_unlink($sftp_conn, $remoteDir . $filename, $remoteArcDir . $filename);
            }
            
            
            /*********************
             * Deplacement fichier local : necessaire sinon sera traité à nouveau
             **/
            if($debug==true){
                 $output .= '<br><br>déplacement du fichier local dans le dossier d\'archivage en cours...';
            }
            rename($dl_path . $remoteDir . $filename, $dl_path . $remoteArcDir . $filename);
            
        }

     }

}


if($debug==true){
    return $output;
}

getTableName() is strictly for the MODX db. You’d have to modify that section to talk to the remote DB.

Thank you very much for your reply.
I tested

    $table_pss_product = "pss_product";
     
    $updateStatement = $ps->prepare("UPDATE " . $table_pss_product . "
    SET `active`=:active
        WHERE `supplier_reference`=:supplier_reference");
    $updateStatement->bindParam(':active', $active, PDO::PARAM_INT);//$active is int (tinyint(1))
    $updateStatement->bindParam(':supplier_reference', $supplier_reference);//string (varchar(32) )

It works fine on a 1500 lines CSV, i ok

WIth a 3200 lines CSV, again a timeout at 5 minutes

Can you tell what kind of timeout you’re getting? Is it a PHP timeout, or is the remote server timing out?

There’s some code on the internet for using cURL to download large files. Here’s an example.

It’s a 504 Gateway Timeout.
The CSV file with 3000 lines is not specially “large” (76 Kb)

Is it important that the CRON job runs just once per day?

If not, could you maybe limit the snippet to process just, say, 500 lines at a time and add a routine to remove those processed lines from the CSV file?

You could run that CRON job frequently throughout the day.

Cron jobs don’t typically have 504 time-outs, are you actually running it through the cron scheduler or something else?

You may be running into protection from bots and scrapers, or a DDOS attack if you’re making too many requests in a short time.

You could try putting some sleep time between file downloads.

For testing I run the routine ‘manually’ without CRON job, with one 3000 line CSV file.
The problem is the same when I run the script with Curl from the server (Gateway timeout at 300 seconds)

The only one value at 300 in php info is user_ini.cache_ttl

Try running it from the commandline (i.e. php path/to/script.php) instead of through the webserver (in browser or with curl, which also goes through the webserver). That should work better.

300 seconds seems to be a default time out value to prevent DDOS attacks on several systems. As MODX connect to a remote database it could be that. I’ll contact my server provider.

I don’t know if timings are normal for the updates, here is what I get with a 1000 lines CSV:

1000 lines with update on one shop => 1000 updates
Modx timings
Query Time 0.2324 s
Query Count 111
Parse Time 117.5468 s
Total Time 117.7791 s
Source database
Memory Usage 2 048 kb

stopwatch : 2 min
browser console: 1.97 min