Forum Replies Created
Power is back on; however, some services are still offline and waiting to be restored.
We are investigating, thank you for letting us know. I was able to start it just now – perhaps give it another try? We will look into the cause of the original failure.
That works too. Do let us know. These are early hiccups; we expect to have fewer of them as we go.
@Fengping we are still trying to figure out the problem. Can you use another site (Utah?) in the meantime while keeping this slice for now? We don’t want to hold up your work.
As far as the site OpenStack is concerned, the VMs are active and presumably fine. We are trying to figure out why management access isn’t working. As you know, this has happened at MAX; we are not sure why yet, but it does not appear to affect other sites. We will continue looking into it.
@Fengping – are you sure you were able to access this slice before? Logs indicate the VMs may not have fully booted, and they must have been in this state for a while.
We are looking into it.
Ilija – as Paul indicated, the near-term plan (within a month or two) is to introduce automatic bastion key rotation and extend the user-facing API so you can easily find out your username and get updated keys. This is in the works (we are integrating these capabilities as we speak). For now we are using a more manual approach to key and account management on bastion hosts, one that is compatible with the long-term architecture. That architecture is what requires these usernames: as you can imagine, in FABRIC we cannot guarantee that your regular username is unique. We ask for a bit of your patience, since we are officially not in operations yet.
I’ve garbage collected everything that wasn’t active at MAX. Let me know if you can still reach your slice.
If you have names and sites for those slices it would help me look.
Yes, this probably leaked during the restart. I suggest you start a new one. I will see if we can garbage collect the old slice. Are there other slices?
@Paul this may be the answer to your question about leaked cards – even though I thought I shut down all slices, I clearly missed something.
November 11, 2021 at 1:20 pm in reply to: Maintenance on FABRIC Production Sites Today (Nov 10) #981
We have brought up TACC, NCSA, STAR, UTAH and MAX, and are in the process of testing them. Please refrain from using them for the moment.
November 11, 2021 at 12:23 am in reply to: Maintenance on FABRIC Production Sites Today (Nov 10) #979
We are experiencing problems with restarting a couple of sites. More updates to come Nov 11 AM.
November 10, 2021 at 10:52 pm in reply to: Maintenance on FABRIC Production Sites Today (Nov 10) #978
All elements of the control framework have been shut down on production infrastructure. We are working on restarting everything.
The VMs are up, but for some reason not reachable. Operations is looking into it.
aa8eda10-6c27-42a5-ad89-6e8a06948203-Node2 instance-00000534 management-2004=10.20.4.228, 63.239.135.87
e52f1301-52f1-4eb4-bd26-c92fc1b66664-Node1 instance-00000533 management-2004=10.20.4.203, 63.239.135.114
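If it helps when comparing notes, here is a minimal Python sketch for pulling the addresses out of rows like the ones above. It assumes each row is laid out as name, instance ID, then `network=address, address`, which is only inferred from these two rows; `parse_server_row` is a hypothetical helper, not part of any FABRIC or OpenStack tool.

```python
import ipaddress


def parse_server_row(row):
    """Split one 'name instance network=addr, addr' row into its parts.

    The row layout is an assumption based on the two rows quoted above.
    """
    name, instance, networks = row.strip().split(None, 2)
    network, _, addr_text = networks.partition("=")
    addresses = [ipaddress.ip_address(a.strip()) for a in addr_text.split(",")]
    return {
        "name": name,
        "instance": instance,
        "network": network,
        # RFC 1918 style addresses vs. everything else
        "private": [str(a) for a in addresses if a.is_private],
        "public": [str(a) for a in addresses if not a.is_private],
    }


row = ("aa8eda10-6c27-42a5-ad89-6e8a06948203-Node2 instance-00000534 "
       "management-2004=10.20.4.228, 63.239.135.87")
info = parse_server_row(row)
print(info["instance"], info["private"], info["public"])
```

The non-private address is the one to try when checking whether management access to a VM works from outside the site.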