Seeds Development With Eclipse

  • 7/31/2019 Seeds Development With Eclipse

    Seeds Development with Eclipse

    Jeremy Villalobos

    University of North Carolina at Charlotte

    Introduction

    This tutorial teaches how to create a Seeds project using Eclipse. It covers creating a Java project and adding jars and javadoc to the project, then shows how to create a simple workpool template from the PGAFramework and deploy it to the Grid.

    Getting Eclipse

    Eclipse is an open source project that you can get for free at http://www.eclipse.org. Go to the website, click on Download, and get "Eclipse IDE for Java Developers." The file is fairly large, so the download will take a few minutes depending on your network connection. Once you have the compressed file, unpack it with the appropriate archive program. The IDE does not have an installation wizard, so just put the folder "eclipse" somewhere on your file system, for example c:/Program Files/ in Windows or /usr/local in Linux. The IDE is started by calling the executable eclipse. In Linux, type /usr/local/eclipse/eclipse on the command line. In Windows, type "c:\Program Files\eclipse\eclipse" on the command line, or just double-click the eclipse executable.

    Getting the PGAF Libraries

    Once Eclipse is loaded, select the workbench button (Figure 1). The button displays a tool tip saying "Workbench" when you hover the mouse over it. There is plenty of documentation on getting started at Eclipse's website (http://help.eclipse.org/ganymede/index.jsp); refer to it for more information on particular Eclipse features. We'll move on to setting up Eclipse to help us code PGAF programs. Eclipse provides help through its javadoc information and by detecting syntax errors in real time. It also generates code automatically for common tasks such as getter and setter functions, and it fills out required interface functions.

    Creating a Project

    In this section we'll go through creating a new Java project and adding the library jars and the javadoc to it.

    Create a new Java project by going to File->New->Java Project. Figure 2 shows the button sequence on the screen.

    Name the project MySeedsProject and click the Finish button.

    Figure 2:

    You can now see the project's folder on the left panel. Eclipse shows the libraries used with the project as well as the source folder.

    Next, we need to add the PGAF jar libraries to the project. Right-click on the project folder and select the Properties option, as shown in Figure 5.

    Figure 4:

    Figure 3:

    Click on Java Build Path on the dialog's left panel. The tabs show options to link the project to source code from other projects, to other Eclipse projects, and to Java libraries. Click on the Libraries tab. You should see something similar to Figure 6.

    Click on the Add Library... button (Figure 7), then click the User Library... button. The new dialog shows an empty list of User Libraries. Click the New... button, give the name Seeds to the library, and click OK. Figure 7 shows the windows involved in this step; the numbers show the order of the actions.

    Figure 5:

    Figure 6:

    While still on the User Libraries dialog, click on the Add JARs... button. In the file manager dialog, navigate to the path where you placed the PGAF libraries. Select all the jars inside the lib folder (Ctrl+A).

    Figure 7:

    You should see something like the picture shown in Figure 8. Notice that each jar file has a list of attributes. Eclipse allows us to set the path to the source code and to the javadoc of each jar file. We will set the javadoc path for the PGAF library to enable Eclipse's real-time, in-line javadoc feature.

    Locate the jar file called "seeds.jar" and double-click on the attribute Javadoc location. Figure 9 shows the dialog you should see. In the field "Javadoc location path:" enter the URL http://coit-grid01.uncc.edu/seeds/javadoc/. You can click the Validate... button to have Eclipse check whether the URL indeed points to a javadoc location. The URL points to a human-readable Javadoc, so you can also go to the site and browse for information on the framework's classes.

    Figure 8:

    Click OK and Apply all the way back to the workbench, and you're done setting up the coding part of the development environment. We still need to configure some settings in order to run a program, but for now we are ready to code.

    Notice that you can also add Javadoc for the Jxta library and other libraries. We showed how to get it for the PGAFramework because it significantly improves your ability to understand and code using Seeds.

    Add the Cog library by repeating the steps done to create the PGAF library.

    The next section goes over creating a simple workpool application.

    Selecting a Template

    At this point, the programmer should evaluate the type of problem to be solved with the framework. The programmer then decides on the specific framework interface that will provide the most benefit in implementing a solution. At the moment, the framework only offers a workpool solution; the generic template and other templates will be made available as needed. For information on how to create an Advanced Template, go here.

    Figure 9:

    Coding

    We will create three classes: a class called MyModule that implements the interface, a class that extends the Data object, and a class that executes the module. Right-click on the project's folder and select New->Class. The New Java Class dialog will open.

    Type the class name MyModule in the Name field, and provide the name of the package as edu.institution. To implement the Workpool interface, click on the Browse... button for the Superclass option.

    Figure 10:

    Start typing Workpool. As you type, the class edu.uncc.grid.pgaf.interfaces.basic.Workpool will be among the reduced list shown. Select that class from the list.

    Implementation note: Although the initial intention was to use Java interfaces for the PGAF Interfaces, some features of Java abstract classes made it necessary to implement them as classes. A PGAF Interface does not equate to a Java interface, and therefore the user's module should use the extends keyword instead of implements.
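The distinction in the implementation note can be illustrated with a small, self-contained sketch. The class names below (WorkpoolLike, HostnameModule) are hypothetical stand-ins and are not part of PGAF; the real Workpool class has its own methods, which are documented in the javadoc. The sketch only shows that a base type carrying instance state and ready-made methods has to be a class, so the user's module subclasses it with extends.

```java
// Hypothetical stand-in for a PGAF "Interface": an abstract class that
// carries instance state, which a Java interface cannot hold.
abstract class WorkpoolLike {
    private int completedJobs = 0;          // state held by the base class

    // Method the user's module must supply.
    abstract String compute(String input);

    // Concrete method inherited by every module.
    final String run(String input) {
        completedJobs++;
        return compute(input);
    }

    int getCompletedJobs() {
        return completedJobs;
    }
}

// The module uses "extends" because WorkpoolLike is a class, not an interface.
class HostnameModule extends WorkpoolLike {
    @Override
    String compute(String input) {
        return "processed:" + input;
    }
}

class ExtendsDemo {
    public static void main(String[] args) {
        HostnameModule module = new HostnameModule();
        System.out.println(module.run("job-0"));
        System.out.println("jobs run: " + module.getCompletedJobs());
    }
}
```

Writing "implements WorkpoolLike" here would be rejected by the compiler, which is the same error a module would hit if it tried to implement a PGAF Interface.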

    Create another class named MyData by following the same steps used to create the MyModule class. It has to implement the edu.uncc.grid.pgaf.datamodules.Data interface, so use the Interfaces option instead of the Superclass option. This is done by clicking on the Add... button below the Browse... button.

    Then create an executable Java class named RunMyModule (file RunMyModule.java). Notice that the New Java Class dialog has a check box that creates the main() function automatically; check it for the RunMyModule class. Alternatively, you can add the static main method by hand.

    Double-click on the MyModule file. Notice that Eclipse has already filled in the functions you should implement for the interface. You can hover your mouse over a PGAF class like "Workpool" and the in-line javadoc will give you some information on the purpose of the interface and on good programming practices when using it. The same trick works on functions: hover the mouse over the Compute() function, for example.

    Figure 11:

    The files used for this tutorial are at http://coit-grid01.uncc.edu/seeds/download/ide_tutorial/hostname.zip. You can replace your files with the ones provided, or you can copy and paste the code. The sample code runs an algorithm in the Compute() function to find the hostname of the machine where Compute() is running. It then puts the name in a MyData object and sends it back to the server node. The DiffuseData() function sends an empty data object. The GatherData() function receives the hostname in the MyData object and prints it to the standard output. getDataCount() requests that the framework run 10 jobs.
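The flow described above can be sketched without the framework. The code below is not the tutorial's hostname.zip sample: the Workpool and MyData classes are local stand-ins, the method names follow the text, and the parameter lists and the sequential runLocally() loop are assumptions made so that the sketch runs on its own. Consult the Seeds javadoc for the real signatures.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the tutorial's MyData class: carries one hostname.
class MyData {
    String hostname = "";
}

// Hypothetical stand-in for the framework's Workpool class. The method names
// follow the tutorial text; the parameter lists are assumptions.
abstract class Workpool {
    abstract MyData DiffuseData(int segment);
    abstract MyData Compute(MyData input);
    abstract void GatherData(int segment, MyData result);
    abstract int getDataCount();
}

class MyModule extends Workpool {
    final List<String> gathered = new ArrayList<>();

    @Override
    MyData DiffuseData(int segment) {
        return new MyData();                 // the sample sends an empty data object
    }

    @Override
    MyData Compute(MyData input) {
        MyData out = new MyData();
        try {
            out.hostname = InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            out.hostname = "unknown-host";   // fallback so the sketch always completes
        }
        return out;
    }

    @Override
    void GatherData(int segment, MyData result) {
        gathered.add(result.hostname);
        System.out.println(result.hostname);
    }

    @Override
    int getDataCount() {
        return 10;                           // request 10 jobs, as in the sample
    }
}

class LocalWorkpoolDemo {
    // Sequential substitute for the framework's scheduling (an assumption).
    static void runLocally(Workpool module) {
        for (int i = 0; i < module.getDataCount(); i++) {
            MyData input = module.DiffuseData(i);
            MyData output = module.Compute(input);
            module.GatherData(i, output);
        }
    }

    public static void main(String[] args) {
        runLocally(new MyModule());
    }
}
```

Run on a single workstation, the sketch prints the local hostname once per job. In the real framework, Compute() would run on whichever node receives the job, so the gathered names could differ.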

    Source files used for the example

    The code in RunMyModule uses the Seeds class. The Seeds class requires one important parameter: the path to the seed folder. This is the folder that will be sent to each of the grid nodes before a job is submitted to run them and boot up the network. The folder contains the jar files for the mentioned libraries, plus a couple of configuration files. The available_servers file lists the available hosts that you want to use to deploy Seeds. In a production environment, one can imagine this file being generated by a scheduler. The second file is the seed_file, which holds the addresses of the rendezvous JXSE servers and is used to get the network connected. At least one server per grid node must have an address written into the seed_file. A sample copy of both files is included in the PGAF folder.
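Because a wrong seed folder path only shows up once the framework starts, it can help to check the folder before launching. The following is a minimal sketch of such a check and is not part of Seeds; the class name SeedFolderCheck is hypothetical, and the file names passed in main() are assumptions taken from the tutorial text, so adjust them to match the sample copies in the PGAF folder.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class SeedFolderCheck {
    // Returns the required entries that are missing from the seed folder.
    // If the folder itself is missing, the folder path is the only entry.
    static List<String> missingEntries(Path seedFolder, String... requiredFiles) {
        List<String> missing = new ArrayList<>();
        if (!Files.isDirectory(seedFolder)) {
            missing.add(seedFolder.toString());
            return missing;
        }
        for (String name : requiredFiles) {
            if (!Files.isRegularFile(seedFolder.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: SeedFolderCheck <seed-folder>");
            return;
        }
        // File names are assumptions based on the tutorial text.
        List<String> missing =
                missingEntries(Paths.get(args[0]), "seed_file", "AvailableServer.txt");
        System.out.println(missing.isEmpty()
                ? "seed folder looks complete"
                : "missing: " + missing);
    }
}
```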

    Running the Program

    In order to run the program, we need to create the seed folder. The seed folder is the framework's folder (C:/lib/Seeds or /home/username/lib/Seeds) containing the library files plus the project we just created, which at the moment is at $HOME/workspace or C:/Documents and Settings/username/workspace. Copy the contents of the bin folder inside workspace/MySeedsProject to the seed folder. The copied contents should preserve Java's package naming convention; otherwise, the Java VM will not find them and a ClassNotFoundException will be thrown. Alternatively, you can create a jar for your project and add the jar file to the shuttle folder. To do this, right-click on the project's folder, select the Export... option, follow the wizard from there, and place the jar in the shuttle folder.

    Figure 12:
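The package naming convention mentioned above means that each class file must sit in a directory path that mirrors its package name, relative to the folder on the classpath. The small sketch below shows that mapping; the helper class ClassFileLayout is hypothetical and only illustrates the standard Java rule.

```java
class ClassFileLayout {
    // Maps a fully qualified class name to the relative path the JVM looks for
    // under a classpath root such as the seed folder.
    static String relativeClassFile(String qualifiedName) {
        return qualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // Package name taken from the tutorial's New Java Class step.
        System.out.println(relativeClassFile("edu.institution.MyModule"));
        // prints edu/institution/MyModule.class
    }
}
```

So if MyModule is in the package edu.institution, the seed folder needs the file at edu/institution/MyModule.class. Copying MyModule.class directly into the top of the seed folder would lead to the ClassNotFoundException described above.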

    Edit the file AvailableServer.txt so that the localhost line is uncommented (Figure 13).

    We will first use the framework within the workstation only. Make sure there are no entries in the seed_file, which is inside the seed folder. Make a note of the path to the seed folder because it will be used as an argument to run the program.

    Create a new run configuration in Eclipse by clicking on the Run button's arrow. Figure 14 shows the Run button; the arrow appears as you hover the mouse over the button. Select Run Configurations....

    On the "Run Configurations" dialog, click the New Run Configuration button as shown in Figure 15. Name the new configuration MySeedsTest. The Project and Main class fields should be filled in automatically. If not, select the project MySeedsProject and enter RunMyModule with its complete class name as the main class.

    Figure 13:

    Figure 14:

    Switch to the Arguments tab, and add the path to the seed folder as an argument.

    Figure 15:

    Figure 16:

    We are ready to test the program. We now need to get a grid proxy. To do this, type on the command line:

    grid-proxy-init

    C:/lib/COG/bin (Windows) or /home/username/lib/COG/bin (Linux) must be on your PATH variable for the previous command to work. You can do this in Linux by typing:

    export PATH=$PATH:/home/username/lib/COG/bin

    In Windows, type:

    set PATH=%PATH%;C:\lib\COG\bin

    These environment variable settings are temporary and last for the session only. To make them permanent in Linux, add them to /etc/profile. In Windows, click Start, right-click My Computer, and select Properties. Then navigate to Environment Variables and edit Path.
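To confirm that the PATH change took effect, you can look for grid-proxy-init in the PATH directories. The sketch below is one way to do that from Java; the class PathLookup is hypothetical and not part of COG or Seeds. It relies on File.pathSeparator, which is ":" on Linux and ";" on Windows. On Windows the launcher may carry a file extension such as .bat, which is an assumption to verify against your COG bin folder.

```java
import java.io.File;

class PathLookup {
    // Searches each directory in a PATH-style string for the named file.
    // Returns the first match, or null if no directory contains it.
    static File findOnPath(String pathValue, String executable) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, executable);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File found = findOnPath(System.getenv("PATH"), "grid-proxy-init");
        System.out.println(found == null
                ? "grid-proxy-init is not on PATH"
                : "found " + found);
    }
}
```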

    A certificate must also be present on the workstation for grid-proxy-init to work. You can use the program WinSCP to copy the certificate files from the Grid organization you belong to. The files are usercert.pem and userkey.pem, which are located in the .globus directory on the server where you requested the certificate. Copy those files to /home/username/.globus.

    Run the application by clicking on the Run button. Logging information from the JXSE library should print on Eclipse's Console panel. The output of the test application will be printed along with the log information. You can redirect the standard error to a separate file to clean up the output.

    Testing and Debugging

    Program debugging

    The framework is a prototype, so unexpected errors can happen as you find creative uses for the presented material. The seed folder keeps a log on each of the machines. The seed folder name starts with the host name to prevent hosts that share a file system from conflicting. From that folder, you can extract PGAF.log (PGAF being the initial name of the framework). It contains a verbose description of the framework's actions; focus mostly on the exceptions thrown. To run a program for performance, reduce the logging by calling Node.getLog.setLevel(Level.INFO), or any other level, in the module's InitializeModule() method.
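The effect of a log level can be seen in isolation with the standard java.util.logging package. Whether the framework's log object is a java.util.logging.Logger is an assumption here; the sketch uses its own logger and a hypothetical counting handler, and only demonstrates how a level such as Level.INFO filters out more verbose records.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LogLevelDemo {
    // Handler that only counts the records it receives.
    static class CountingHandler extends Handler {
        int count = 0;
        @Override public void publish(LogRecord record) { count++; }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Logs one FINE and one INFO message with the logger set to the given
    // level and returns how many of them passed the logger's level filter.
    static int recordsPassed(Level level) {
        Logger log = Logger.getLogger("demo.pgaf." + level.getName());
        log.setUseParentHandlers(false);   // keep the demo quiet on the console
        CountingHandler handler = new CountingHandler();
        log.addHandler(handler);
        log.setLevel(level);
        log.fine("verbose framework detail");
        log.info("high-level status message");
        log.removeHandler(handler);
        return handler.count;
    }

    public static void main(String[] args) {
        System.out.println("Level.ALL  passes " + recordsPassed(Level.ALL) + " records");
        System.out.println("Level.INFO passes " + recordsPassed(Level.INFO) + " records");
        System.out.println("Level.OFF  passes " + recordsPassed(Level.OFF) + " records");
    }
}
```

With java.util.logging, Level.INFO still lets INFO and more severe messages through, while Level.OFF suppresses everything.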

    Any exception related to your code will be routed to the standard output. You can also send information via this channel by using the RemoteLogger.printToObserver() function within your module.

    Communication

    Make sure the ports are open before running this tutorial. Also, the NAT-circumventing feature is not fully tested, so first test the framework on nodes that can clearly reach each other. The framework uses UPnP to get access from your workstation; if your workstation is connected to a router with UPnP turned off, the program is likely to experience connection problems.

    If your client is using wireless, the hotspot provider may have closed some of the ports used by the framework. The error will show up as a timeout exception:

    java.net.SocketTimeoutException: connect timed out
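Before starting the framework, a quick TCP probe can tell you whether a port is reachable from your workstation. The sketch below is a generic check and not part of Seeds; the class PortCheck is hypothetical. The default port 50040 is the one this tutorial says the peer on the submit host listens on, and the 3-second timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers SocketTimeoutException (for example a filtered port) and
            // ConnectException (nothing listening), among others.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50040;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

A probe that fails from a wireless hotspot but succeeds from a wired network suggests the hotspot is filtering the port, rather than a problem in the framework itself.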

    Assumptions And Issues

    The framework is a prototype. It makes some assumptions about the Grid nodes. It assumes that the Grid node's submit host will also host a peer, and that the peer will listen on port 50040. It also assumes that the Grid nodes will schedule the jobs to run almost immediately. Fork and Condor deployments are supported. No interface exists to communicate with meta-schedulers.

    The framework may not correctly identify the network interface on a virtualized system.

    Multicast should be enabled on the Grid nodes to let the network choose a leader. Alternatively, a shared file system can be used to select a leader.

    Running the Program (Command Line)

    If you are not using an IDE, you can run PGAF projects from the command line. Make sure all the pertinent class and jar files are in the CLASSPATH. This is done differently on each platform. Scripts to set up Cog and Seeds on the CLASSPATH are provided on the Seeds home page at:

    http://coit-grid01.uncc.edu/seeds/download/scripts/set_pgaf_environment.sh

    and

    http://coit-grid01.uncc.edu/seeds/download/scripts/set_pgaf_environment.bat

    The Linux command is run by typing:

    source set_pgaf_environment.sh /path/to/cog/libraries/ path/to/shuttle/folder/

    The script may need to be made executable with the command:

    chmod +x set_pgaf_environment.sh

    Running on the Local Machine (Command Line)

    Lastly, we can run the program. To run the framework locally, make sure that only the local host is listed in the Available Servers file. Then run the following command:

    java -classpath $CLASSPATH my.institutions.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path

    The framework will take a few seconds to load. The P2P log is written to the standard error. In Linux, stderr can be redirected using 2> error.file.txt:

    java -classpath $CLASSPATH my.institutions.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path 2> error.file.txt

    The output should have the name of your local workstation 10 times.

    Running on a Remote Machine

    Next, we can run the program on a remote Grid node. For this, the user needs a certificate from a Grid node, placed in the ~/.globus directory. Getting a grid certificate and getting access to grid resources is beyond the scope of this tutorial. We refer the reader to the Globus website (http://www.globus.org/) for that information.

    The bin folder from the cog libraries downloaded at the beginning of this tutorial should be on the PATH. The Linux script provided here should set that for you. You can also set the path with the command:

    export PATH=$PATH:/path/to/cog/lib/bin/folder

    For Windows, click the Start button and right-click on My Computer. Select Properties, go to the Advanced tab, select Environment Variables, and edit PATH to include the cog lib bin folder. Make sure the path contains no spaces, as this is known to cause errors.

    Now we need to get a proxy. Type:

    grid-proxy-init

    on the command line. When prompted, type the certificate's passphrase. This creates the proxy we need to run jobs on the external grid node.

    Add the grid node's job submit host to the Available Servers file, following the rules specified in the file. Take out the local machine entry. We assume there will be a rendezvous node at the remote node, so also add a host to the seed_file with the port 50040. Put at least two computers or CPU cores in the remote machine's description. Now we can repeat the same command that was used for the local machine:

    java -classpath $CLASSPATH my.institutions.domain.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path

    This time, it should take longer. But after a few minutes, the result should come back.

    It is possible to run a local machine together with a list of remote hosts. For this example, because the amount of work is so small, the local machine would finish all the jobs before the grid nodes come online.
