These changes are more than acceptable from a developer's or small-company perspective. However, I am afraid that for our needs we require something more.
For instance, DUCC and unpacked.cern.ch are definitely above the limit of 200 pulls every 6 hours. And unfortunately I don't see any way to bring them down without increasing the latency between publication on the docker hub and publication on unpacked.cern.ch.
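For reference, the remaining budget can be inspected directly. Below is a minimal Python sketch (not part of DUCC or unpacked.cern.ch) that queries Docker Hub's documented rate-limit headers for an anonymous pull against the test repository Docker provides for this purpose:

```python
# Minimal sketch: check how much of Docker Hub's anonymous pull budget
# (200 pulls per 6 hours per IP) is left, using the documented
# rate-limit headers on the ratelimitpreview/test repository.
import requests

TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io"
    "&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"

token = requests.get(TOKEN_URL).json()["token"]
resp = requests.head(MANIFEST_URL, headers={"Authorization": f"Bearer {token}"})

# Header values look like "200;w=21600": 200 pulls per 21600 s (6 h) window.
print("limit:    ", resp.headers.get("ratelimit-limit"))
print("remaining:", resp.headers.get("ratelimit-remaining"))
```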
I am also afraid that a lot of scientific code is just forgotten in some docker hub image, and that code will eventually be lost.
Gitlab registries provide a nice solution to this problem. But relying on them should be a conscious and deliberate choice, not just the result of inertia and of using the most convenient tool of the moment.
CERN has the capacity to operate an image registry for our needs and not doing so should be our deliberate choice.
I see a risk in the use of Gitlab registries. They are not the core business of Gitlab but a complementary service, and there is no long-term guarantee on their availability and support. Moreover, if the financial situation of Gitlab changes, they could decide to move registries to another price tier, too expensive for CERN.
I would like to know your opinion on the matter.
We already have registry.cern.ch, which is based on Harbor (a CNCF graduated project). It is not used for container images today but for other OCI artifacts like helm charts; gitlab remains the recommended image registry.
There's ongoing discussion and work to improve the registry functionality we have at CERN, including vulnerability scans, artifact signing and many other features, as well as how to integrate gitlab with an external registry.
I don't think we need to worry about gitlab's commitment to its registry; there would be a clear and easy transition if it were ever necessary.
At least for the time being, registry.cern.ch seems to be available only internally. What we have in mind is a cvmfs-enabled container registry for the benefit of WLCG sites.
Yes, and it's not used for docker images. I was pointing out that the transition from gitlab wouldn't be hard if ever needed. I didn't get from the previous message that this was a request for a cvmfs-backed registry.
Our small-ish k8s cluster (~500 cores) was often getting 504 gateway timeouts when pulling images from CERN gitlab. This is with the Docker graph driver plugin, so each "image" is really only a few KB of JSON.
Switching from the "latest" tag to an explicit tag sidestepped the problem by avoiding the need to check for image updates, which is best practice for production anyway. But it doesn't make me feel confident about large-scale use of the gitlab registry.
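Pinning can even go one step further than an explicit tag: a tag can be resolved once to its immutable digest and then referenced as image@sha256:.... The sketch below is only an illustration of that idea using the standard Registry v2 API, shown against Docker Hub (with library/ubuntu:22.04 as a stand-in repository) because its token endpoint is public; it is not specific to the gitlab setup described above.

```python
# Hedged sketch: resolve a tag to its immutable digest via the Registry v2 API,
# so deployments can pin "image@sha256:..." instead of a mutable tag.
import requests

repo = "library/ubuntu"   # stand-in repository for illustration
tag = "22.04"

# Anonymous pull token for Docker Hub.
token = requests.get(
    "https://auth.docker.io/token",
    params={"service": "registry.docker.io", "scope": f"repository:{repo}:pull"},
).json()["token"]

# HEAD the manifest and read the digest the registry reports for it.
resp = requests.head(
    f"https://registry-1.docker.io/v2/{repo}/manifests/{tag}",
    headers={
        "Authorization": f"Bearer {token}",
        # Accept both a manifest list and a single manifest, matching what a pull would use.
        "Accept": ", ".join([
            "application/vnd.docker.distribution.manifest.list.v2+json",
            "application/vnd.docker.distribution.manifest.v2+json",
        ]),
    },
)
print(f"{repo}@{resp.headers['Docker-Content-Digest']}")
```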
In ATLAS we are talking about organising a meeting on this. It's clear that if we move things to CERN we need to ask for resources. Gitlab, as Ryan says, is not powerful enough, not even for small images, so we either need to get gitlab into shape or have a different registry, and of course the interaction with CVMFS needs to be discussed with you guys. It's likely one meeting will not be enough.
We raised the issue in the IT-ATLAS meeting and it was brought to the attention of CERN-IT. It needs an executive decision from CERN-IT management. The more groups make noise, the more likely it is that we get resources.
The Docker company has done so much for us that I don't think we should begrudge giving them $60 per year. Think of it as community support. Even a 1-hour meeting with CERN-IT will cost CERN far more than that in people's time. It's not even worth the paperwork to get reimbursed for that amount. I volunteer to personally create and pay for an account if no one else wants to do it.
They did indeed do a lot for us, but after looking at the pricing my understanding is that with $60 a month only the authenticated owner has unlimited downloads, which I don't think is workable.
Docker pricing for the Pro plan is $5 per month. I am assuming that containers will be distributed to the grid through cvmfs, so we should mainly care about the cost of downloading to unpacked.cern.ch and its successor for distributing per-user containers. gitlab.cern.ch would never be able to reasonably sustain the rate of downloading individual user containers to grid nodes.
OK, the separate issue is long-term archival of containers. I agree it makes sense to use gitlab for that, but if users only upload there the containers that they want to keep long term, it's not clear that gitlab isn't already adequate for that.
I agree. Their current pricing still makes some tasks possible, e.g. converting images to unpacked.cern.ch. But we cannot have the grid getting images from dockerhub (which is not a good idea anyway). For archival, having them in cvmfs is an option too.