Understanding Job Ready States
A job becomes "ready" when:
- Current time ≥ scheduled run time
- All conditions satisfied
- Not already running or held
However, being ready doesn't guarantee immediate start. Several factors can block execution.
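The three readiness criteria above can be sketched as a small shell predicate. This is only an illustration: `is_ready` and its arguments are hypothetical names, not part of Xi-Batch, and the values are supplied by hand rather than read from the scheduler.

```bash
# Hypothetical helper mirroring the three "ready" criteria.
# Arguments (illustrative, not Xi-Batch output fields):
#   $1 current time (epoch seconds)   $2 scheduled run time (epoch seconds)
#   $3 "yes" if all conditions are satisfied
#   $4 job state, e.g. "Run", "Held", or empty when idle
is_ready() {
    local now=$1 run_at=$2 conditions_ok=$3 state=$4
    [ "$now" -ge "$run_at" ] || return 1        # scheduled time not reached
    [ "$conditions_ok" = "yes" ] || return 1    # at least one condition false
    case "$state" in
        Run|Held) return 1 ;;                   # already running or held
    esac
    return 0
}

# Usage: a job scheduled 60 seconds ago, conditions met, neither running nor held.
now=$(date +%s)
if is_ready "$now" "$((now - 60))" yes ""; then
    echo "ready"
else
    echo "not ready"
fi
```

A job that passes this test is only eligible to start; the blocking factors described next still apply.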
Common Reasons Jobs Wait
- Load level constraints: CLOAD + job load > LOADLEVEL
- Unsatisfied conditions: One or more conditions evaluate false
- Remote variable unavailable: Critical condition references unavailable remote host
- User load limit exceeded: User's total running jobs exceed their limit
- Priority ordering: Higher priority jobs started first
- Start rate limiting: STARTLIM/STARTWAIT delaying batch start
Diagnostic Approach
Step 1: Identify Waiting Job
bash
# List all jobs and their states
btjlist
# Focus on jobs that should be running
btjlist | grep -v " Run " | grep -v " Held "
Note the job number of the waiting job.
Step 2: Check Job Details
bash
# Get comprehensive job information
btjlist <job_number>
# Or use btjstat for detailed status
btjstat <job_number>
Look for:
- Time to run (should be in the past)
- Conditions list
- Load level value
- Priority
- State/status
Step 3: Check Load Levels
bash
# View system load variables
btvar -v LOADLEVEL CLOAD
# Check job's load level
btjlist <job_number> | grep "Load level"
Calculation:
If CLOAD + job_load > LOADLEVEL, the job cannot start.
Example:
LOADLEVEL: 20000
CLOAD: 18500
Job load: 2000
18500 + 2000 = 20500 > 20000 ← Job blocked
Solution:
Wait for running jobs to complete (reducing CLOAD), or increase LOADLEVEL:
bash
btvar -s LOADLEVEL 25000
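The load calculation in this step can be checked by hand with a few lines of shell arithmetic. The sketch below assumes the three numbers have already been read off the command output; `load_check` is a hypothetical name and does not call any Xi-Batch command.

```bash
# Hypothetical helper: prints "blocked" or "ok" for the load rule
# CLOAD + job_load > LOADLEVEL.
#   $1 CLOAD   $2 job load level   $3 LOADLEVEL
load_check() {
    local cload=$1 job_load=$2 loadlevel=$3
    if [ $((cload + job_load)) -gt "$loadlevel" ]; then
        echo "blocked"
    else
        echo "ok"
    fi
}

load_check 18500 2000 20000   # figures from the example: prints "blocked"
load_check 18500 2000 25000   # same job after raising LOADLEVEL: prints "ok"
```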
Step 4: Examine Conditions
bash
# View job conditions
btjlist <job_number>
# Look for "Conditions:" section
Each condition shows:
- Variable name
- Comparison operator
- Expected value
- Critical flag
Example condition:
STATUS = Ready
Check variable's actual value:
bash
btvar -v STATUS
If STATUS contains "Pending" instead of "Ready", the condition is not satisfied and the job waits.
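To make the comparison concrete, here is a minimal sketch of how a single condition could be evaluated once the variable's actual value is known. `condition_holds` is a hypothetical helper, and the operator set beyond `=` (`!=`, `<`, `<=`, `>`, `>=`) is an assumption made for illustration; the numeric operators expect integer values.

```bash
# Hypothetical helper: evaluates "<actual> <op> <expected>".
# Returns 0 when the condition holds, 1 when it does not,
# and 2 for an operator it does not recognise.
condition_holds() {
    local actual=$1 op=$2 expected=$3
    case "$op" in
        "=")  [ "$actual" = "$expected" ] ;;
        "!=") [ "$actual" != "$expected" ] ;;
        "<")  [ "$actual" -lt "$expected" ] ;;
        "<=") [ "$actual" -le "$expected" ] ;;
        ">")  [ "$actual" -gt "$expected" ] ;;
        ">=") [ "$actual" -ge "$expected" ] ;;
        *)    echo "unknown operator: $op" >&2; return 2 ;;
    esac
}

# Usage: the STATUS = Ready condition while STATUS still holds "Pending".
if condition_holds "Pending" "=" "Ready"; then
    echo "satisfied"
else
    echo "not satisfied"
fi
```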
Step 5: Check Variable Values
For each condition on the job, verify the variable's current value:
bash
# Check specific variable
btvar -v <variable_name>
# Check multiple variables
btvlist | grep -E "var1|var2|var3"
For remote variables:
bash
# Remote variable format: machine:varname
btvar -v remotemachine:STATUS
If the remote machine is unavailable and the condition is critical, the job is blocked.
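The interaction between reachability and the critical flag can be summarised as a small decision function. This is a sketch of the rule as described in this guide, not scheduler code; `remote_condition_result` and its yes/no arguments are hypothetical.

```bash
# Hypothetical helper: how one remote condition affects a job.
#   $1 "yes" if the remote host is reachable
#   $2 "yes" if the condition is marked critical
#   $3 "yes" if the condition evaluates true (only meaningful when reachable)
remote_condition_result() {
    local reachable=$1 critical=$2 holds=$3
    if [ "$reachable" = "yes" ]; then
        if [ "$holds" = "yes" ]; then echo "pass"; else echo "block"; fi
    elif [ "$critical" = "yes" ]; then
        echo "block"     # critical and unreachable: the job waits
    else
        echo "ignore"    # non-critical and unreachable: condition is skipped
    fi
}

remote_condition_result no yes yes   # prints "block"
remote_condition_result no no yes    # prints "ignore"
```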
Step 6: Verify User Load Limits
bash
# Check user's current load usage
btjlist -u <username> | grep " Run "
# Count running jobs' total load
btjlist -u <username> | grep " Run " | awk '{sum+=$NF} END {print sum}'
Compare against user's load level limit (requires admin access to view):
bash
btuser -l <username>
# Look for "Max total ll" (maximum total load level)
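Once the running jobs' load levels and the user's limit are known, the comparison itself is a sum and one test. The sketch below reads load levels from standard input, one per line, so it does not depend on any particular column layout; `user_limit_check` is a hypothetical name and the sample numbers are made up for illustration.

```bash
# Hypothetical helper: sums load levels read from stdin (one per line) and
# checks whether adding a new job would exceed the user's limit.
#   $1 new job load   $2 user's maximum total load level
user_limit_check() {
    local new_load=$1 max_total=$2 used
    used=$(awk '{ sum += $1 } END { print sum + 0 }')
    if [ $((used + new_load)) -gt "$max_total" ]; then
        echo "blocked: $used + $new_load > $max_total"
    else
        echo "ok: $used + $new_load <= $max_total"
    fi
}

# Illustrative loads of five running jobs (total 9500), a new job of 1000,
# and a limit of 10000.
printf '%s\n' 2000 2000 2000 2000 1500 | user_limit_check 1000 10000
```

With these sample figures the function prints `blocked: 9500 + 1000 > 10000`.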
Resolving Common Blocking Scenarios
Scenario 1: Load Level Exceeded
Symptom:
bash
btvar -v LOADLEVEL CLOAD
# LOADLEVEL: 20000
# CLOAD: 19500
# Job load: 1000
# 19500 + 1000 > 20000 - Job waits
Solutions:
Option A: Increase LOADLEVEL
bash
btvar -s LOADLEVEL 25000
Option B: Wait for jobs to complete
Monitor CLOAD:
bash
watch -n 5 'btvar -v CLOAD'
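When CLOAD drops to 19000 or below, the job can start (19000 + 1000 = 20000, which no longer exceeds LOADLEVEL), provided nothing else is blocking it.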
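The threshold to watch for follows directly from the load rule: the job fits as long as CLOAD is no greater than LOADLEVEL minus the job's load. A minimal sketch, using hypothetical helper names and the figures from this scenario:

```bash
# Hypothetical helpers derived from the rule CLOAD + job_load > LOADLEVEL.

# Largest CLOAD at which the job still fits.
#   $1 LOADLEVEL   $2 job load level
max_cload_for_start() {
    echo $(( $1 - $2 ))
}

# Smallest LOADLEVEL that would let the job start right now.
#   $1 CLOAD   $2 job load level
min_loadlevel_needed() {
    echo $(( $1 + $2 ))
}

max_cload_for_start 20000 1000    # prints 19000
min_loadlevel_needed 19500 1000   # prints 20500
```

The second figure shows that any LOADLEVEL of 20500 or more, such as the 25000 used in Option A, is enough to unblock the job.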
Option C: Reduce job's load level
bash
btjchange -l 500 <job_number>
Scenario 2: Condition Not Satisfied
Symptom:
bash
# Job condition: STATUS = Ready
btvar -v STATUS
# STATUS: Pending
# Condition false, job waits
Solutions:
Option A: Set variable to required value
bash
btvar -s STATUS Ready
The job should then start immediately, provided nothing else is blocking it.
Option B: Remove condition
If condition no longer relevant:
bash
btjchange <job_number>
# In btq interface, edit conditions, delete the condition
Option C: Wait for another job to set variable
If variable should be set by another job's assignment:
bash
# Check which job sets STATUS
btjlist | grep -i status
# Look for jobs with assignments to STATUS variable
Scenario 3: Remote Variable Unavailable
Symptom:
bash
# Job condition: server2:BACKUP_DONE = Yes (critical)
btvar -v server2:BACKUP_DONE
# Error: Cannot connect to server2
Solutions:
Option A: Restore remote host connectivity
Investigate network or Xi-Batch connectivity to server2:
bash
# Test network connectivity
ping server2
# Check Xi-Batch scheduler on server2
# (the [b] keeps grep from matching its own process)
ssh server2 "ps aux | grep '[b]tsched'"
Option B: Change condition to non-critical
If acceptable for job to start when remote unavailable:
bash
btjchange <job_number>
# Edit condition, remove critical flag
Option C: Use local variable instead
Create local copy of variable:
bash
# Create local variable
btvar -c BACKUP_DONE "Yes"
# Change job to use local variable
btjchange <job_number>
# Edit condition: server2:BACKUP_DONE → BACKUP_DONE
Scenario 4: User Load Limit Exceeded
Symptom:
User already running many jobs, new job waits:
bash
# User jsmith running jobs
btjlist -u jsmith | grep " Run "
# Shows 5 large jobs already running
# User's max total load level: 10000
# Already using: 9500
# New job load: 1000
# Would exceed limit
Solutions:
Option A: Wait for user's jobs to complete
Monitor user's current load:
bash
watch -n 10 'btjlist -u jsmith | grep " Run "'
Option B: Increase user's load limit (requires admin)
bash
btuser -u jsmith
# Edit user, increase "Max total ll"
Option C: Reduce job's load level
bash
btjchange -l 200 <job_number>
Scenario 5: Priority Ordering
Symptom:
Lower priority job waiting while higher priority jobs start:
bash
# Your job priority: 100
# Other ready jobs priority: 150-200
# LOADLEVEL limit reached
# Higher priority jobs start first
Solutions:
Option A: Increase job priority
bash
btjchange -p 200 <job_number>
Option B: Wait for higher priority jobs to complete
This is normal behavior: priority ordering is intentional.
Option C: Reduce other jobs' priority (if you own them)
bash
btjchange -p 50 <other_job_number>
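The effect of priority ordering under a tight load budget can be illustrated with a toy model. The sketch below is an assumption about the general idea, not a description of the actual scheduler: it sorts ready jobs by descending priority and starts each one that still fits in the free load. `start_order`, the job names, and the numbers are all hypothetical.

```bash
# Toy model: start ready jobs in descending priority order while the free
# load allows. Input lines on stdin: "<job> <priority> <load>".
#   $1 free load (LOADLEVEL - CLOAD)
start_order() {
    local free=$1 job prio load
    sort -k2,2nr | while read -r job prio load; do
        if [ "$load" -le "$free" ]; then
            echo "start $job"
            free=$((free - load))
        else
            echo "wait $job"
        fi
    done
}

# Three ready jobs of load 1000 each, with only 2000 of free load.
printf '%s\n' "jobA 100 1000" "jobB 200 1000" "jobC 150 1000" | start_order 2000
```

Here jobB (priority 200) and jobC (priority 150) start, and jobA (priority 100) waits, which matches the symptom described in this scenario.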
Systematic Troubleshooting Checklist
Use this checklist to diagnose waiting jobs:
1. Load Level Check
bash
btvar -v LOADLEVEL CLOAD
btjlist <job_number> | grep "Load level"
# Calculate: CLOAD + job_load vs LOADLEVEL
- Load levels allow job to start
2. Time Check
bash
btjlist <job_number> | grep "Time to run"
# Compare against current time
- Run time is in the past
3. Conditions Check
bash
btjlist <job_number>
# Review conditions list
For each condition:
- Variable exists
- Variable value satisfies condition
- Remote variables accessible (if critical)
4. User Limits Check
bash
btjlist -u <username> | grep " Run "
# Sum load levels of running jobs
- User hasn't exceeded load limit
5. Priority Check
bash
btjlist | grep " Ready " | sort -k6 -rn
# Shows ready jobs by priority
- No higher priority jobs waiting
6. Hold Status Check
bash
btjstat <job_number> | grep -i hold
- Job not on hold
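When working through the checklist by hand, it can help to record each result and report the first check that fails. The sketch below assumes you pass six yes/no answers in checklist order; `first_blocker` is a hypothetical helper and performs no scheduler queries itself.

```bash
# Hypothetical helper: takes six "yes"/"no" results in checklist order and
# prints the first failing check. A missing argument counts as "no".
first_blocker() {
    local name
    for name in "load level" "run time" "conditions" \
                "user limit" "priority" "hold status"; do
        if [ "${1:-no}" != "yes" ]; then
            echo "blocked by: $name"
            return 1
        fi
        shift
    done
    echo "no blocker found"
}

# Usage: everything passes except the conditions check.
first_blocker yes yes no yes yes yes || true   # prints "blocked by: conditions"
```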
Advanced Diagnostics
Trace Variable Changes
If condition involves variable that should change:
bash
# Enable variable logging
btvar -s LOGVARS varlog
# Watch variable changes
tail -f /var/spool/xi/batch/varlog | grep <variable_name>
Monitor Job Transitions
Enable job logging to track state changes:
bash
# Enable job logging
btvar -s LOGJOBS joblog
# Watch job state changes
tail -f /var/spool/xi/batch/joblog | grep <job_number>
Check Network Variables
For jobs with remote conditions:
bash
# List all remote variables in conditions
btjlist <job_number> | grep -E "machine:"
# Test each remote variable
for var in server1:VAR1 server2:VAR2; do
    echo "Testing $var:"
    btvar -v "$var"
done
Verification After Resolution
After making changes, verify job starts:
bash
# Watch job list
watch -n 2 'btjlist <job_number>'
# Should transition to "Run" state within moments
If still waiting, repeat diagnostic process.
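For scripted verification, where an interactive `watch` is not practical, a bounded polling loop is one option. The following is a generic sketch: `wait_for` is a hypothetical helper that re-runs any command until it succeeds or the attempts run out, and the `true` in the usage line is a stand-in for a real check such as a grep on the job list for the running state.

```bash
# Hypothetical helper: re-run a command until it succeeds or attempts run out.
#   $1 maximum attempts   $2 seconds between attempts   remaining args: command
wait_for() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0     # check passed: stop polling
        sleep "$delay"
        i=$((i + 1))
    done
    return 1                 # gave up after the last attempt
}

# Usage with a stand-in check that always succeeds.
if wait_for 3 1 true; then
    echo "job started"
else
    echo "still waiting"
fi
```

A non-zero return is the cue to go back through the diagnostic steps.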
Best Practices
- Use informative variable names: Makes condition troubleshooting easier
- Document job dependencies: Note which variables jobs depend on
- Monitor variable changes: Enable LOGVARS for critical workflows
- Set reasonable load levels: Avoid jobs with excessive load values
- Use appropriate priorities: Reserve high priorities for truly critical jobs
- Test conditions before deployment: Verify conditions work as expected
- Provide fallback mechanisms: Use non-critical remote conditions when appropriate