Fatal when submitting jobs #4

Open
mabahj opened this issue Apr 17, 2015 · 4 comments

Comments

@mabahj

mabahj commented Apr 17, 2015

I get a fatal error (console output below) when I try to submit jobs. The message contains no error output. Environment: Jenkins 1.599, PBS Plug-in 0.2, master running on Windows 7, slaves on Linux, SGE grid. If I copy the command line shown in the console output and run it manually, qsub accepts it and the job is submitted. But the job then fails because the generated script (/temp/jenkins/pbs/jenkinsPBS_2918185274526175465/script) does not have write permission.

Error message:

Created working directory '/temp/jenkins/pbs/jenkinsPBS_2918185274526175465' with permissions 'rwx------'
PBS script: /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/script
FATAL: Failed to submit job script with command line 'qsub -e /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/err -o /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/out /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/script'. Error output: 
ERROR: Failed to submit job script with command line 'qsub -e /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/err -o /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/out /temp/jenkins/pbs/jenkinsPBS_2918185274526175465/script'. Error output: 
Finished: FAILURE
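
To narrow down the permission problem, something like the following could be run on the slave right after a failed build. It is only a sketch: `check_pbs_script` is a made-up helper name (not part of the plug-in), and it assumes GNU `stat` as found on Linux. It prints the mode and owner of the working directory and of the script, then reports whether the current user can read, write, and execute the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PBS plug-in): report the permissions
# of a generated job directory and the script inside it.
# Assumes GNU stat (Linux).
check_pbs_script() {
    dir="$1"
    script="$dir/script"

    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi

    # %A = symbolic mode, %U = owner name, %n = file name
    stat -c '%A %U %n' "$dir" "$script"

    # Access is tested for the user running this function, so run it as
    # the same user the grid job runs as.
    for flag in r w x; do
        if test "-$flag" "$script"; then
            echo "$flag: yes"
        else
            echo "$flag: no"
        fi
    done
}
```

For the build above the call would be `check_pbs_script /temp/jenkins/pbs/jenkinsPBS_2918185274526175465`. Note that root passes the read and write tests regardless of the mode bits, so the result is only meaningful for the job's own user.
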
@kinow
Member

kinow commented Apr 17, 2015

Hmmm, the tricky part will be reproducing this issue. All I have for testing is a VirtualBox/Vagrant PBS Torque box. Do you know of a way to reproduce it in an environment similar to yours?

@mabahj
Author

mabahj commented Apr 17, 2015

Well, you could set up SGE, which is free, but I can't demand anything here. Another option would be to add some more logging output. I've enabled full logging in Jenkins, and the only PBS entry I see is this:

apr 17, 2015 8:45:42 AM FINE hudson.remoting.Channel
Received UserRequest:jenkins.plugins.pbs.tasks.Qsub@293c71

If you added some output to the log, it should be easier to see what happens.
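
As an illustration of the kind of output that would help: logging the exit status of the submit command next to its stdout and stderr would make an empty "Error output:" much less ambiguous. Below is a rough sketch in shell, not in the plug-in's own code; `run_logged` is a hypothetical name, and I have not checked how the plug-in actually invokes qsub.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, keep stdout and stderr apart,
# and always log the exit status, even when stderr is empty.
run_logged() {
    outfile=$(mktemp) || return 1
    errfile=$(mktemp) || { rm -f "$outfile"; return 1; }

    # Capture the status without aborting under 'set -e'.
    status=0
    "$@" >"$outfile" 2>"$errfile" || status=$?

    echo "command: $*"
    echo "exit status: $status"
    echo "stdout: $(cat "$outfile")"
    echo "stderr: $(cat "$errfile")"

    rm -f "$outfile" "$errfile"
    return "$status"
}
```

It would be called as `run_logged qsub -e <dir>/err -o <dir>/out <dir>/script`, with `<dir>` standing for the generated working directory. By shell convention an exit status of 127 means the command was not found on the PATH and 126 means it was found but could not be executed, so the status alone already says something when stderr is empty.
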

Job config:

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <actions/>
  <description>https://groups.google.com/forum/#!topic/biouno-users/fWBUIOiWjUg

http://biouno.org/jenkins-plugins.html

https://github.com/biouno/pbs-plugin/releases</description>
  <keepDependencies>false</keepDependencies>
  <properties>
    <hudson.plugins.throttleconcurrents.ThrottleJobProperty plugin="[email protected]">
      <maxConcurrentPerNode>0</maxConcurrentPerNode>
      <maxConcurrentTotal>0</maxConcurrentTotal>
      <categories>
        <string>slow_jobs</string>
      </categories>
      <throttleEnabled>false</throttleEnabled>
      <throttleOption>category</throttleOption>
    </hudson.plugins.throttleconcurrents.ThrottleJobProperty>
  </properties>
  <scm class="hudson.scm.NullSCM"/>
  <assignedNode>SGE</assignedNode>
  <canRoam>false</canRoam>
  <disabled>false</disabled>
  <blockBuildWhenDownstreamBuilding>false</blockBuildWhenDownstreamBuilding>
  <blockBuildWhenUpstreamBuilding>false</blockBuildWhenUpstreamBuilding>
  <triggers/>
  <concurrentBuild>false</concurrentBuild>
  <builders>
    <jenkins.plugins.pbs.PBSBuilder plugin="[email protected]">
      <script>#!/bin/bash
echo "=========================================="
echo "Sleeping on grid computer $(hostname)"
sleep 60
echo "Done"
echo "=========================================="</script>
    </jenkins.plugins.pbs.PBSBuilder>
  </builders>
  <publishers/>
  <buildWrappers/>
</project>

Node config:

<?xml version="1.0" encoding="UTF-8"?>
<jenkins.plugins.pbs.slaves.PBSSlave plugin="[email protected]">
  <name>SGE</name>
  <description>Son of Grid</description>
  <remoteFS>/work/jenkins/jenkins_test_grid_slave</remoteFS>
  <numExecutors>2</numExecutors>
  <mode>EXCLUSIVE</mode>
  <retentionStrategy class="hudson.slaves.RetentionStrategy$Always"/>
  <launcher class="hudson.plugins.sshslaves.SSHLauncher" plugin="[email protected]">
    <host>myhost</host>
    <port>22</port>
    <credentialsId>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</credentialsId>
    <maxNumRetries>0</maxNumRetries>
    <retryWaitTime>0</retryWaitTime>
  </launcher>
  <label/>
  <nodeProperties>
    <hudson.slaves.EnvironmentVariablesNodeProperty>
      <envVars serialization="custom">
        <unserializable-parents/>
        <tree-map>
          <default>
            <comparator class="hudson.util.CaseInsensitiveComparator"/>
          </default>
          <int>8</int>
          <string>GridEngRoot</string>
          <string>/cad/gnu/sge_test</string>
          <string>PATH</string>
          <string>/usr/bin:/usr/sbin:/bin:/usr/bin/X11:/usr/local/etc/jre/current/bin:/pri/jenkins/bin:/cad/gnu/sge_test/bin:/cad/gnu/sge_test/bin/lx-amd64</string>
          <string>SGE_ARCH</string>
          <string>lx-amd64</string>
          <string>SGE_CELL</string>
          <string>default</string>
          <string>SGE_CLUSTER_NAME</string>
          <string>sim1</string>
          <string>SGE_EXECD_PORT</string>
          <string>6445</string>
          <string>SGE_QMASTER_PORT</string>
          <string>6444</string>
          <string>SGE_ROOT</string>
          <string>/cad/gnu/sge_test</string>
        </tree-map>
      </envVars>
    </hudson.slaves.EnvironmentVariablesNodeProperty>
  </nodeProperties>
  <userId>jenkins</userId>
</jenkins.plugins.pbs.slaves.PBSSlave>

@kinow
Member

kinow commented Jun 29, 2015

Note to self: test this Docker image when debugging this issue: https://registry.hub.docker.com/u/agaveapi/torque/

@kinow
Member

kinow commented Jul 24, 2015

The Docker image worked. I tried it with a job configuration that comes with the container and will try your job configuration next. While working on #9 I'll probably comment here on what's wrong or how you could get your setup working.
