L3A composite product - nothing is happening


#1

Good afternoon,

After downloading 2 months of data and obtaining the L2A products, I have not been able to get sen2agri to generate an L3A Composite. The scheduled product creation is still on hold, and when I try to manually produce one, the job gets scheduled but nothing happens.

Can you help?

Thank you


#2

Hello,

Could you please check the status (and possibly the logs produced while launching the L3A Composite) of the following services:

  • sudo journalctl -fu sen2agri-orchestrator
  • sudo journalctl -fu sen2agri-executor
  • sudo journalctl -fu sen2agri-scheduler

I assume that the site and season are still enabled.
Another check would be to execute the following:

  • Log in as user “sen2agri-service”: sudo su -l sen2agri-service
  • Execute “srun ls”. Does it execute OK?
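
Since `journalctl -f` keeps following the log, a one-off snapshot filtered for problem lines may be easier to paste into a reply. A rough sketch, assuming a POSIX-style shell with the usual grep, sed, sort and uniq; the function names and the pattern list are my own guesses, not an official list of Sen2Agri messages:

```bash
# Keep only journal lines that look like problems (reads stdin).
# The patterns are assumptions; extend them as needed.
filter_problems() {
    grep -iE 'error|cannot be started|will not be executed' || true
}

# Count how often each problem message occurs, most frequent first.
# The "date host unit[pid]: " prefix is stripped so identical messages group together.
summarize_problems() {
    filter_problems | sed -E 's/^.*\]: //' | sort | uniq -c | sort -rn
}
```

It could be used as `sudo journalctl --no-pager -n 500 -u sen2agri-orchestrator | summarize_problems`, and likewise for the executor and scheduler units.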

Also, could you tell us if you executed a “scheduled job” from dashboard or you executed it from “Custom Jobs”?

Best regards,
Cosmin


#5

Hi,

Here it goes:
Could you please check the status (and eventually the logs during the launching of the L3A Composite) for the following services:

sudo journalctl -fu sen2agri-orchestrator

[jonaszed@localhost ~]$ sudo journalctl -fu sen2agri-orchestrator
-- Logs begin at Wed 2019-02-27 18:41:43 EST. --
Mar 02 14:41:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Wed Aug 16 2017)
Mar 02 14:41:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Wed Aug 16 2017)
Mar 02 14:41:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Wed Aug 16 2017)
Mar 02 14:41:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Mon Sep 10 2018, end = Wed Jul 10 2019, current=Wed Aug 16 2017)
Mar 02 14:41:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3B: Error getting season start dates for site 1 for scheduled date Wed Aug 16 00:00:00 2017!
Mar 02 14:41:52 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:42:02 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:42:12 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:42:22 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 1 ms
Mar 02 14:42:32 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 1 ms
Mar 02 14:42:42 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 3 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetProducts took 7 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduled job for L4A and site ID 1 with start date Sat Sep 10 00:00:00 2016 and end date Wed Mar 15 00:00:00 2017 will not be executed (no products)!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler CropType: Error no shapefile found for site 1!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 3 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler CropType: Error no shapefile found for site 1!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Mon Aug 15 2016)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Mon Aug 15 2016)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Mon Aug 15 2016)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Mon Aug 15 2016)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Mon Sep 10 2018, end = Wed Jul 10 2019, current=Mon Aug 15 2016)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3B: Error getting season start dates for site 1 for scheduled date Mon Aug 15 00:00:00 2016!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 3 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetProducts took 7 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduled job for L4A and site ID 1 with start date Thu Sep 10 00:00:00 2015 and end date Tue Mar 15 00:00:00 2016 will not be executed (no products)!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Getting season dates for site 1 for scheduled date Wed Sep 30 00:00:00 2015!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Extracted season dates: Start: Thu Sep 10 00:00:00 2015, End: Sun Jul 10 00:00:00 2016!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetProducts took 7 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduled job for L3A and site ID 1 with start date Thu Sep 10 00:00:00 2015 and end date Fri Sep 25 00:00:00 2015 will not be executed (no products)!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Wed Aug 16 2017)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Wed Aug 16 2017)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Wed Aug 16 2017)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Wed Aug 16 2017)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Mon Sep 10 2018, end = Wed Jul 10 2019, current=Wed Aug 16 2017)
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3B: Error getting season start dates for site 1 for scheduled date Wed Aug 16 00:00:00 2017!
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:48 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 3 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler CropType: Error no shapefile found for site 1!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Getting season dates for site 1 for scheduled date Fri Sep 30 00:00:00 2016!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Extracted season dates: Start: Sat Sep 10 00:00:00 2016, End: Mon Jul 10 00:00:00 2017!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 5 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetProducts took 7 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduled job for L3A and site ID 1 with start date Sat Sep 10 00:00:00 2016 and end date Sun Sep 25 00:00:00 2016 will not be executed (no products)!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Getting season dates for site 1 for scheduled date Mon Apr 30 00:00:00 2018!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3A: Extracted season dates: Start: Sun Sep 10 00:00:00 2017, End: Tue Jul 10 00:00:00 2018!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetProducts took 8 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduled job for L3A and site ID 1 with start date Tue Mar 6 00:00:00 2018 and end date Wed Apr 25 00:00:00 2018 will not be executed (no products)!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Fri Aug 31 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Fri Aug 31 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Fri Aug 31 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Fri Aug 31 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Mon Sep 10 2018, end = Wed Jul 10 2019, current=Fri Aug 31 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler CropMask: Error getting season start dates for site 1 for scheduled date Fri Aug 31 00:00:00 2018!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteDescriptions took 3 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler CropType: Error no shapefile found for site 1!
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetConfigurationParameters took 6 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: GetSiteSeasons took 4 ms
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Thu Aug 16 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Thu Sep 10 2015, end = Sun Jul 10 2016, current=Thu Aug 16 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sat Sep 10 2016, end = Mon Jul 10 2017, current=Thu Aug 16 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Sun Sep 10 2017, end = Tue Jul 10 2018, current=Thu Aug 16 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: IsInSeason: Date not in season (start = Mon Sep 10 2018, end = Wed Jul 10 2019, current=Thu Aug 16 2018)
Mar 02 14:42:49 localhost.localdomain sen2agri-orchestrator[7439]: Scheduler L3B: Error getting season start dates for site 1 for scheduled date Thu Aug 16 00:00:00 2018!
Mar 02 14:42:52 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:43:02 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:43:12 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:43:22 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:43:32 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms
Mar 02 14:43:42 localhost.localdomain sen2agri-orchestrator[7439]: GetNewEvents took 3 ms

sudo journalctl -fu sen2agri-executor

[jonaszed@localhost ~]$ sudo journalctl -fu sen2agri-executor
-- Logs begin at Wed 2019-02-27 18:41:43 EST. --
Feb 28 00:01:02 localhost.localdomain sen2agri-executor[668]: MarkStepPendingStart took 2 ms
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: GetProcessorDescriptions took 3 ms
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: GetConfigurationParameters took 5 ms
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: GetProcessorDescriptions took 3 ms
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: GetConfigurationParameters took 5 ms
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: HandleStartProcessor: Executing command srun with params --qos qoslai --job-name TSKID_27683_STEPNAME_BVInputVariableGeneration_0 /usr/bin/sen2agri-processor-wrapper SRV_IP_ADDR=127.0.0.1 SRV_PORT_NO=7777 WRP_SEND_RETRIES_NO=3600 WRP_TIMEOUT_BETWEEN_RETRIES=1000 WRP_EXECUTES_LOCAL=1 JOB_NAME=TSKID_27683_STEPNAME_BVInputVariableGeneration_0 PROC_PATH=/usr/bin/otbcli PROC_PARAMS BVInputVariableGeneration -samples 40000 -out /mnt/archive/orchestrator_temp/l3b/128/27683-lai-bv-input-variable-generation/out_bv_dist_samples.txt -minlai 0.0 -maxlai 5.0 -modlai 0.5 -stdlai 1.0 -minala 5.0 -maxala 80.0 -modala 40.0 -stdala 20.0
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: HandleStartProcessor: Executing command sbatch with params --job-name TSKID_27683_STEPNAME_BVInputVariableGeneration_0 --qos qoslai /tmp/sen2agri-executor.vrA668
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: HandleStartProcessor: Sbatch command returned: "Submitted batch job 202
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: "
Mar 01 00:01:14 localhost.localdomain sen2agri-executor[668]: MarkStepPendingStart took 6 ms

sudo journalctl -fu sen2agri-scheduler

[jonaszed@localhost ~]$ sudo journalctl -fu sen2agri-scheduler
-- Logs begin at Wed 2019-02-27 18:41:43 EST. --
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: UpdateScheduledTasksStatus took 33 ms
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 5, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 3, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 2, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 5, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 3, siteId: 1 cannot be started now as is invalid
Mar 02 14:51:49 localhost.localdomain sen2agri-scheduler[5372]: UpdateScheduledTasksStatus took 26 ms
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: GetScheduledTasks took 5 ms
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: UpdateScheduledTasksStatus took 32 ms
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 5, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 3, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 5, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 2, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 3, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:48 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 2, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 2, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 5, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 6, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: The job for processor: 3, siteId: 1 cannot be started now as is invalid
Mar 02 14:52:49 localhost.localdomain sen2agri-scheduler[5372]: UpdateScheduledTasksStatus took 24 ms

I assume that the site and season are still enabled.
Yes, the site exists with 4 seasons enabled, each of them corresponding to a yearly season.

Another check would be to execute the following:

log in as user “sen2agri-service” : sudo su -l sen2agri-service

Execute: “srun ls”. Does it execute OK?

The output is:

[sen2agri-service@localhost ~]$ srun ls
srun: Required node not available (down, drained or reserved)
srun: job 204 queued and waiting for resources

Also, could you tell us if you executed a “scheduled job” from dashboard or you executed it from “Custom Jobs”?
I had scheduled monthly jobs (the default), and as nothing was happening I also tried 4 custom jobs.

Best regards,

João


#6

Help anyone? I am really stuck…


#7

Hello,

It seems that your problem is not actually related to the processors but to SLURM.

srun: Required node not available (down, drained or reserved)
srun: job 204 queued and waiting for resources

Did you run out of disk space on the system partition at some point (or are you perhaps still running out of disk space on this partition)?
If so, make sure you have enough disk space and then restart the SLURM services.
You can look for SLURM errors in its log files: /var/log/slurm/slurm.log, /var/log/slurm/slurmd.log.
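
Before restarting anything, it can also help to see why SLURM considers the node unavailable. A minimal sketch, assuming the node is named `localhost` and that `scontrol show node` prints `State=` and `Reason=` fields in its usual layout; the two helper names are made up for illustration:

```bash
# Extract the State= value from `scontrol show node` output read on stdin.
node_state() {
    grep -oE 'State=[^ ]+' | head -n 1 | cut -d= -f2
}

# Extract the Reason= text (it may contain spaces), if the node reports one.
node_reason() {
    sed -nE 's/^[[:space:]]*Reason=(.*)$/\1/p' | head -n 1
}
```

For example, `scontrol show node localhost | node_state` and `scontrol show node localhost | node_reason`, together with `df -h /` to see how full the system partition currently is.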
You can try restarting the following services with systemctl:

slurmd, slurmdbd, slurmctld and mariadb

After that you can try:

sudo -u sen2agri-service scontrol update NodeName=localhost State=RESUME
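
The restart and resume steps above could be wrapped up roughly as follows. This is only a sketch: the restart order (database first, then slurmdbd, slurmctld, slurmd) is my assumption, and `run`, `restart_slurm_stack` and the `DRY_RUN` variable are hypothetical names, not part of Sen2Agri or SLURM:

```bash
# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Restart the services one by one, stopping at the first failure,
# then ask SLURM to put the node back into service.
restart_slurm_stack() {
    for unit in mariadb slurmdbd slurmctld slurmd; do
        run sudo systemctl restart "$unit" || return 1
    done
    run sudo -u sen2agri-service scontrol update NodeName=localhost State=RESUME
}
```

Running `DRY_RUN=1 restart_slurm_stack` first shows the commands without touching anything; `restart_slurm_stack` on its own executes them.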

Hope this helps.

Best regards,
Cosmin


#8

Hi,

First and foremost I would like to thank you for your support.

Hello,

It seems that your problem is not actually related to the processors but to SLURM.

srun: Required node not available (down, drained or reserved)
srun: job 204 queued and waiting for resources

Did you run out of disk space on the system partition at some point (or are you perhaps still running out of disk space on this partition)?
In this case, make sure you have enough disk space and then restart the SLURM services.

I actually ran out of space a few weeks ago. I increased the group volume when I noticed it, and I currently still have a few TB available.

You can check the errors that you have with SLURM checking its log files: /var/log/slurm/slurm.log, /var/log/slurm/slurmd.log.

Can I send the log files to you?

You can try restarting with systemctl the following services:

slurmd, slurmdbd, slurmctld and mariadb

I have restarted them using systemctl restart; they all appear as running.

After that you can try:

sudo -u sen2agri-service scontrol update NodeName=localhost State=RESUME

Command executed without any errors.

How can I check if things are working?

Regards,

João


#9

Hello,

First, you can try running

srun ls

again under the sen2agri-service user account.
If this command is successful, you can check that the system is processing your jobs using:

journalctl -fu sen2agri-executor

Things should be moving there if everything is OK with SLURM.
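
If you would rather not have `srun ls` sit in the queue indefinitely when the node is still down, the check can be given a time limit. A small sketch, assuming the coreutils `timeout` command is available; `check_srun` and the `SRUN` override are hypothetical names used only for this example:

```bash
# Run a trivial SLURM job, but give up after a time limit (default 30 s)
# instead of waiting in the queue. SRUN can be overridden for testing.
check_srun() {
    if timeout "${1:-30}" "${SRUN:-srun}" ls >/dev/null 2>&1; then
        echo "SLURM OK: srun ls completed"
    else
        echo "SLURM problem: srun ls failed or timed out"
        return 1
    fi
}
```

Run it as the sen2agri-service user, for example `check_srun 60`. If it reports a problem, `squeue` lists the jobs that are still waiting.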

Best regards,
Cosmin


#10

Working like a charm, thank you, you are a star