1. Potential fix on certain bash command in notebook


  • #3347
    Yingqiang Yuan
    Participant

      I have created a StableOK slice, but running that block of commands yields errors. Is there something I should check?

      #3350
      Donald Petravick
      Participant

        Similarly, I am working with the libraries on Linux/Mac. I had working code break as of Friday.

        The submit emits errors:

        WARNING:root:Failed to get layer: ‘str’ object has no attribute ‘layer’

        My attempt to set a route after submit:

        (Pdb) network.get_subnet()
        WARNING:root:Failed to get layer: ‘str’ object has no attribute ‘layer’

        I’m no expert, but the network.slice.slice is serialized java, and it looks like all the network provisioning was done on the server. Maybe on Friday there was a change on the FABRIC server side in how all this is serialized when it is sent to the client, and now it’s not deserializing?

        #3352
        Paul Ruth
        Keymaster

          @Donald – Which site did your VM go to? I suspect your problem is that your VM landed on CLEM, FIU, GPN, or UCSD.  These sites are not fully deployed yet and the networks will always fail.

          Try adding avoid=['CLEM', 'FIU', 'GPN', 'UCSD'] to your calls to add_node or get_random_sites.   You might also avoid STAR and MAX while we debug an issue with their dataplane switches.
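A minimal sketch of the suggestion above, assuming `fablib` is an already-configured fablib object whose `get_random_sites` accepts `count` and `avoid` arguments as described in this post. The helper `pick_sites` and the constant `SITES_TO_AVOID` are illustrative names, not part of fablib.

```python
# Sites named in this thread as not fully deployed or still being debugged.
SITES_TO_AVOID = ['CLEM', 'FIU', 'GPN', 'UCSD', 'STAR', 'MAX']


def pick_sites(fablib, count=2, avoid=SITES_TO_AVOID):
    """Ask fablib for `count` random sites, skipping the ones in `avoid`.

    `fablib` is passed in as an argument so this sketch does not depend
    on how the library is imported in your environment.
    """
    sites = fablib.get_random_sites(count=count, avoid=list(avoid))
    # Defensive check: fail early if an avoided site slipped through.
    unwanted = [site for site in sites if site in avoid]
    if unwanted:
        raise RuntimeError(f"fablib returned avoided sites: {unwanted}")
    return sites
```

The same list can be passed as `avoid=SITES_TO_AVOID` to `add_node`, per the post above. The list will go stale as sites come online, so treat it as a snapshot of this thread.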

          #3353
          Paul Ruth
          Keymaster

            @Yingqiang – It’s hard to tell what is happening here. It looks like you are trying to ssh with an IPython magic command. I am not familiar with these yet. The error is “Host key verification failed.” However, it looks like you are passing StrictHostKeyChecking=no and UserKnownHostsFile=/dev/null. These parameters should instruct ssh to skip host key verification, so I suspect they are not being passed correctly.

            Are you able to ssh from a regular command line? Can you use the node.execute() command in fablib?
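For readers who have not used `node.execute()`, here is a hedged sketch of running one command on every node of a slice. It assumes the slice object offers `get_nodes()`, that each node offers `get_name()`, and that `node.execute(command)` returns a `(stdout, stderr)` pair. The helper name `run_on_all_nodes` is made up for this example.

```python
def run_on_all_nodes(slice_obj, command):
    """Run `command` on every node of a slice via node.execute().

    Returns {node_name: (stdout, stderr)}. If a node cannot be reached
    (for example because of an ssh key problem), stdout is None and the
    error text is stored in the stderr slot instead of raising.
    """
    results = {}
    for node in slice_obj.get_nodes():
        name = node.get_name()
        try:
            stdout, stderr = node.execute(command)
            results[name] = (stdout, stderr)
        except Exception as exc:  # keep going so one bad node does not hide the rest
            results[name] = (None, str(exc))
    return results
```

In a notebook this might be called as `run_on_all_nodes(fablib.get_slice(name='MySlice'), 'hostname')`, where `'MySlice'` is a placeholder for your own slice name.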

            #3354
            Donald Petravick
            Participant

              No, the test is two nodes at NCSA and two at TACC, each with an L3 IPv4 network.

              After the submit, I go to set up routes so that all nodes can talk to all other nodes.

              When I get the network object to find the things I need for routing, the network.get_gateway() and network.get_subnet() calls give errors like

              (Pdb) network.get_subnet()
              WARNING:root:Failed to get layer: ‘str’ object has no attribute ‘layer’

              It broke on Friday, so I suspected a change during the downtime, but after Greg and I worked together today, I see the software still works at NERSC, so I am beginning to wonder about my (rickety) Python install on my Mac… So let me clean house a bit and see what I learn.
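The routing step described in this post can be sketched with the standard library alone. This is an illustration, not Donald's code: `route_commands` is a hypothetical helper, and it assumes you have already collected each network's subnet and gateway (the values that `network.get_subnet()` and `network.get_gateway()` are expected to return) into a dictionary.

```python
import ipaddress


def route_commands(networks):
    """Build `ip route add` commands so every network can reach the others.

    `networks` maps a network name to a (subnet, gateway) pair.
    Returns {network_name: [command, ...]}, the commands to run on each
    node attached to that network.
    """
    parsed = {
        name: (ipaddress.ip_network(str(subnet)), ipaddress.ip_address(str(gateway)))
        for name, (subnet, gateway) in networks.items()
    }
    commands = {}
    for name, (subnet, gateway) in parsed.items():
        # A gateway outside its own subnet usually means bad input data.
        if gateway not in subnet:
            raise ValueError(f"gateway {gateway} is not inside {subnet}")
        commands[name] = [
            f"sudo ip route add {other_subnet} via {gateway}"
            for other, (other_subnet, _) in parsed.items()
            if other != name
        ]
    return commands
```

Each resulting string could then be handed to `node.execute()` on the nodes of the matching network. Parsing with `ipaddress` first means a string where an address object was expected fails loudly here rather than later on the node.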

              #3369
              Ilya Baldin
              Participant

                @Donald for a controlled stable environment consider using Jupyter Hub. The added benefit is a reproducible experiment in a notebook. You can start with one of the notebooks we provide and modify it.

                #3371
                Yingqiang Yuan
                Participant

                  I am not aware of how to use node.execute(). However, the ssh command yielded by the successful creation of a slice does not work properly. @Paul Ruth

                  #3374
                  Paul Ruth
                  Keymaster

                    @Yingqiang  –  Have you tried the FABRIC example notebooks that come pre-installed in your JupyterHub environment?  They all use node.execute().   Try the Hello, FABRIC example.

                    There are videos that walk you through some: https://www.youtube.com/playlist?list=PL64VqyRjOwSFaDlX-bk7KXAiiCF3FP4vv

                    #3399
                    Yingqiang Yuan
                    Participant

                      @Paul I have run through the Hello, FABRIC example, and it yields the same error: I can create a slice but cannot ssh into the nodes. I suspect it has something to do with the keys because the error message is “Permission denied (publickey,gssapi-keyex,gssapi-with-mic).kex_exchange_identification: Connection closed by remote host”. Regarding the key, I created it this August at the earliest, so it should not have expired.

                      #3400
                      Paul Ruth
                      Keymaster

                        This definitely has something to do with your keys or the fabric_rc configuration file.

                        Can you confirm your bastion key is valid? See the troubleshooting section of this article: https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/

                        Can you re-run the configure environment example notebook? This will recreate your fabric_rc file. Note that you will need to set the correct paths to where you uploaded your known good bastion key.

                        Let me know if either of these work.
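Before regenerating anything, a quick local sanity check of the key file can rule out the simplest causes. The sketch below is a generic check, not part of fablib, and `check_private_key` is a hypothetical helper. It only looks at whether the file exists, is readable, and has permissions ssh will accept. It cannot tell whether the bastion key has expired on the FABRIC side.

```python
import os
import stat


def check_private_key(path):
    """Return a list of local problems found with an ssh private key file."""
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return [f"{path}: no such file"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append(f"{path}: not readable by the current user")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # ssh refuses private keys that group or other users can access.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{path}: permissions {oct(mode)} are too open; try chmod 600")
    return problems
```

An empty list means none of these local problems was found. Point it at whichever path your fabric_rc names for the bastion key.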

                        #3402
                        Donald Petravick
                        Participant

                          @ilya @paul, so I moved to NERSC, where Greg Daues is working successfully (and where we’d work to tie our prototype to the larger system). The good news is it works! (Well, almost.) If Greg and I land on the same cluster head node, we conflict in /tmp/fablib: one of us owns the directory and the other cannot access it. I put a two-line change to fablib.py in my distribution, making it sensitive to an environment variable, so I can keep working (see below, with some surrounding lines for the change). I see there is a framework that tries to get FABRIC_LOG_FILE from a configuration setup, but FABRIC_LOG_FILE is not settable from the environment, AFAICT.
                          Is there a supported way for me to change the logfile/data directory in the current code? If not, are my changes useful to you?

                          All in all, this is working out very well for us.

                          LOG_LEVELS = {
                              'DEBUG': logging.DEBUG,
                              'INFO': logging.INFO,
                              'WARNING': logging.WARNING,
                              'ERROR': logging.ERROR,
                              'CRITICAL': logging.CRITICAL
                          }

                          default_fabric_rc = os.environ['HOME'] + '/work/fabric_config/fabric_rc'
                          default_log_level = 'DEBUG'
                          default_data_dir = os.environ.get('FABRIC_DATA_DIR', '/tmp/fablib')  # change near line 526 of fablib.py
                          default_log_file = os.path.join(default_data_dir, 'fablib.log')  # change

                          fablib_object = None

                          ssh_thread_pool_executor = None

                          #3404
                          Paul Ruth
                          Keymaster

                            You can set it in your fabric_rc file.

                            I had a similar issue when I ran multiple fablib apps at the same time. What I did was change the exports in the fabric_rc to the following, so that the log is in the same folder as the executable.

                            export FABRIC_LOG_FILE=fablib.log
                            export FABRIC_LOG_LEVEL=DEBUG 

                            Right now some of the config code is not tested as well as it could be. We will get this smoothed out soon.
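To make the precedence discussed in the last two posts concrete, here is a small sketch of resolving the log file and level from environment-style settings. It is not fablib's actual configuration code. `resolve_log_settings` is a hypothetical helper, `FABRIC_DATA_DIR` is the variable proposed in the post above rather than an established setting, and the per-user fallback directory is an assumption added here so that two users on the same head node would not share one directory.

```python
import getpass
import logging
import os
import tempfile

LOG_LEVELS = {
    'DEBUG': logging.DEBUG,
    'INFO': logging.INFO,
    'WARNING': logging.WARNING,
    'ERROR': logging.ERROR,
    'CRITICAL': logging.CRITICAL,
}


def resolve_log_settings(environ=os.environ):
    """Return (log_file, log_level), preferring explicit settings.

    Order: FABRIC_LOG_FILE, then FABRIC_DATA_DIR/fablib.log, then a
    per-user directory under the system temp dir (assumed fallback).
    """
    log_file = environ.get('FABRIC_LOG_FILE')
    if log_file is None:
        data_dir = environ.get('FABRIC_DATA_DIR')
        if data_dir is None:
            # Per-user directory avoids ownership clashes on shared hosts.
            data_dir = os.path.join(tempfile.gettempdir(),
                                    f"fablib-{getpass.getuser()}")
        log_file = os.path.join(data_dir, 'fablib.log')
    level_name = environ.get('FABRIC_LOG_LEVEL', 'DEBUG').upper()
    # Unknown level names fall back to DEBUG rather than raising.
    return log_file, LOG_LEVELS.get(level_name, logging.DEBUG)
```

With the two exports shown above in effect, this returns the relative path `fablib.log`, which resolves against the directory the program is started from.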

                            #3406
                            Yingqiang Yuan
                            Participant

                              @Paul I have tried both methods, and here are the issues I am running into. I ran configure_enviroment, and when I try to ssh into a host, it reports: “Warning: Identity file /home/fabric/.ssh/fabric_bastion not accessible: No such file or directory. yyuan76_0000024961@bastion-1.fabric-testbed.net: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).” I checked the link you recommended and configure_enviroment.ipynb, and I believe that neither of them involves creating files in the ~/.ssh directory. Should that be a concern?

                              #3407
                              Paul Ruth
                              Keymaster

                                @Yingqiang – Can you confirm that you are using the newest version of the jupyter-examples? They should be in your JupyterHub container in a folder called “jupyter-examples-rel1.3.3”.

                                #3411
                                Yingqiang Yuan
                                Participant

                                  @Paul Previously I was mainly following the instructions from the URL you just listed: https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/ . Also, I have been trying to figure out the difference between that link and configure_enviroment.ipynb in jupyter-examples-rel1.3.3. It would be great if you could offer me some insight into the difference between them. Meanwhile, I apologize for not mentioning that there was one time when I could actually ssh into the node. Are you sure that I absolutely need to delete all the key pairs, etc., and start from scratch?
