Proposal for DaemonSet deployment of Prometheus Agent #6600

Merged · 7 commits merged from agent-daemonset-proposal into prometheus-operator:main on Jun 20, 2024

Conversation

@haanhvu (Contributor) commented on May 16, 2024

Proposal for DaemonSet deployment of Prometheus Agent

@haanhvu requested a review from a team as a code owner on May 16, 2024.
@simonpasquier (Contributor) left a comment

This is looking great!

@ArthurSens (Member) left a comment

Awesome start!

I have a few small comments but the proposal already looks pretty good!

I think we could already start refactoring our codebase to extract common configuration that will be used by both statefulset and daemonset modes :)


> The current (StatefulSet) deployment brings along the corresponding pitfalls:
> * Load management & scalability: Since one or several high-availability Prometheus Agents are responsible for scraping metrics of the whole cluster, users would need to calculate/estimate the load and scalability of the whole cluster to decide on replicas and sharding strategies. Estimating cluster-wide load and scalability is a much harder task than estimating node-wide load and scalability.
> * Security: Similarly, cluster-wide security is a much bigger problem than node-wide security.
A Member left a comment

Should we provide examples of how cluster-wide attacks can be avoided with daemonsets?

@haanhvu (Contributor, Author) commented on May 17, 2024

Actually I'm thinking of removing Security from Why and Pitfalls of the current solution.

The first reason is that, AFAIU, StatefulSet and DaemonSet face different kinds of security issues. In DaemonSet, the Prometheus Agent pod knows all secrets and shares them with all other pods. In StatefulSet, the complexity of security is at cluster scope and we have to deal with network issues. So we can hardly say one is better than the other regarding security.

The second reason is that security is not the key reason we chose to implement DaemonSet.

Do you agree to remove it from our consideration?

@simonpasquier @kakkoyun

A Member left a comment

Given the comment above, I'm in favor of removing or at least re-writing the security concerns. I don't think we should be saying that one is better than the other, but acknowledging how they are different.

A Member left a comment

We should remove the security section. I don't think our primary goal is to address this. I don't even see how it would be more secure than the statefulset approach. I'd even argue that it would be more insecure. But again I think we should drop it.

@haanhvu (Contributor, Author) commented

Removed it. @simonpasquier let us know if you have different opinions.

> * Scraped load is very large or hard to estimate.
> * Scalability is hard to predict.
> * Security is a big concern.
> * They want to collect node system metrics (e.g. kubelet, node exporter).
A Member left a comment

Here it sounds like we can't collect kubelet/node exporter metrics with statefulset, which isn't true 🤔

@haanhvu (Contributor, Author) commented

Yeah we can do it with kubernetesSDConfigs in ScrapeConfig? If there's no advantage of solving this use case with DaemonSet, I'll remove it then.

@ArthurSens (Member) commented on May 17, 2024

We can also use ServiceMonitor/PodMonitor, just need to adjust labels between services/pods :)

Yeah, I'd remove this part
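For illustration, a minimal sketch of that ServiceMonitor approach for kubelet metrics; the Service name, namespace, label, and port name are assumptions (kube-prometheus, for example, maintains a similar `kubelet` Service in `kube-system`), not something prescribed in this thread:

```yaml
# Hedged sketch: a ServiceMonitor that selects a kubelet Service.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kubelet   # hypothetical label on the Service
  endpoints:
    - port: https-metrics               # hypothetical port name
      scheme: https
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true        # for illustration only
```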

@haanhvu (Contributor, Author) commented

> Yeah we can do it with kubernetesSDConfigs in ScrapeConfig?

I meant we can do this in the StatefulSet mode. This is the go-to solution in StatefulSet, right?
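For reference, a minimal sketch of that go-to StatefulSet-mode solution: a ScrapeConfig using `kubernetesSDConfigs` with the Node role. The resource name, Secret reference, and TLS settings below are illustrative assumptions, not taken from the proposal:

```yaml
# Hedged sketch: scraping kubelets via Kubernetes node service discovery.
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: kubelet-nodes        # hypothetical name
spec:
  kubernetesSDConfigs:
    - role: Node
  scheme: HTTPS
  authorization:
    credentials:
      name: prometheus-token # hypothetical Secret holding a bearer token
      key: token
  tlsConfig:
    insecureSkipVerify: true # for illustration only
```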

@haanhvu (Contributor, Author) commented on May 20, 2024

@kakkoyun @simonpasquier @ArthurSens I resolved the reviews and also left comments on the reviews not resolved yet. Please take a look.

@bwplotka @pintohutch If you have time, do you mind taking a look at this too? ^^

@bwplotka (Contributor) commented

Nice, will check by tomorrow at the latest 🤞🤞🤞🤞

@kakkoyun (Member) commented

On top of my to-do list 👍

@ArthurSens (Member) left a comment

Really good work 🥳

There are a few things that we could clarify but, as mentioned before, it looks good enough to start working on the codebase already :)


> ## 1. Why
>
> When deploying Prometheus Agent in Kubernetes, three of the biggest users’ concerns are: load distribution, scalability, and security.
A Member left a comment

Nit: have you seen this statement somewhere? If yes, it would be nice to have a reference here :)

@haanhvu (Contributor, Author) commented on May 21, 2024

I didn't cite it from a source. I formed this general observation from reading users' blogs, browsing the issues in our repo, and my experience setting up some nodes (not a k8s cluster though) for benchmark pipelines for Jaeger during last year's GSoC.

There are of course other concerns too, like cost ^^ But I'm not sure DaemonSet could help with reducing cost, so I didn't state it here.

A Contributor left a comment

In some way it helps a lot with the cost, because it scales with the load, so you don't need to keep beefy Prometheus servers when your cluster is scaled back. Of course some mix of vertical and horizontal scaling for thousands of small agents would be the best from the cost perspective, but we will never be able to do this with scraped metrics (and it's fine). DaemonSet is somewhat in this direction while maintaining some other pros of stable collection 🤗

@haanhvu (Contributor, Author) commented on May 25, 2024

> In some way it helps a lot with the cost, because it scales with the load, so you don't need to keep beefy Prometheus servers when your cluster is scaled back

Yeah, I mentioned automatic scaleup but forgot to mention automatic scaledown. I'll add this to the Scalability section then. We don't have any proof of cost so I wouldn't mention cost here. But automatic scaledown implicitly refers to cost optimization (and environmental benefits too, hopefully ^^).


@kakkoyun previously approved these changes on May 21, 2024

@kakkoyun (Member) left a comment

This LGTM. I'll have another look at it but this shouldn't be a blocker.

One additional thing I'd like to mention is Grafana Agent. We should check how they approached it (they already support several deployment approaches). What considerations did they make? We can even try to reach out to people and ask about the trade-offs.

> ## 5. Non-Goals
>
> The non-goals are the features that are not easy to implement and require more investigation. We will need to investigate whether there are actual user needs for them and, if yes, how to best implement them. We’ll handle these after the MVP.
> * ServiceMonitor support: There's a performance issue regarding this feature. Since each Prometheus Agent running on a node requires one watch, making all Prometheus Agent pods watch all endpoints will put a huge stress on the Kubernetes API server. This is the main reason why GMP hasn’t supported this, even though there are user needs stated in some issues ([#362](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/362), [#192](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/192)). However, as discussed with Danny from GMP [here](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/192#issuecomment-2028850846), ServiceMonitor support based on EndpointSlice seems like a viable approach. We’ll investigate this further after the MVP.
A Member left a comment

This is a non-goal but it'd be nice to attack this if we ever finish our planned goals before the program ends.

@bwplotka (Contributor) left a comment

Looks good generally! Suggested some wording changes, but the plan sounds good! Great work and amazing to see this work moving forward 💪🏽

I think the main challenge here is making sure not to confuse users with too many/too complex configuration options: making it as easy as possible to use, to discover configuration pieces, and to debug when something is misconfigured. Not sure what can replace a fresh CRD honestly, but maybe more focused docs/guides on this mode would be good enough! 🤗

(Disclaimer: I work for Google Cloud Managed Prometheus Team)




> When deploying Prometheus Agent in Kubernetes, three of the biggest users’ concerns are: load distribution, scalability, and security.
>
> DaemonSet deployment solves all these three concerns:
A Contributor left a comment

Suggested change:

```diff
-DaemonSet deployment solves all these three concerns:
+DaemonSet deployment significantly improves on all of these three concerns:
```

It's not perfect. It does not solve load distribution down to a single series, or even to a single target. It's a pragmatic solution that works well enough for 99.9% of users. (:

> DaemonSet deployment solves all these three concerns:
> * Load distribution: Each Prometheus Agent pod will only scrape the targets located on the same node. Even though the targets on some nodes may produce more metrics than other nodes, the load distribution would be reliable enough.
> * Automatic scalability: When new nodes are added to the cluster, new Prometheus Agent pods will be automatically added on the nodes that meet user-defined restrictions (if any).
> * Security: Since the scraped targets are local to the Prometheus Agent pod (on the same node), the scope of security problems is reduced to each node.
@bwplotka (Contributor) commented on May 21, 2024

This is in practice... very hard to scale and audit. 🙃 But in theory there could be some security improvements, not sure if I agree with the explanation though (@pintohutch added good arguments below).

@TheSpiritXIII is helping a lot to get us to a better place here, and Prometheus Operator will be better with those changes e.g. prometheus/prometheus#13956

A Contributor left a comment

> Since the scraped targets are local to the Prometheus Agent pod (on the same node), the scope of security problems is reduced to each node.

Can you give an example? I would think it would actually go the other way.

What if there are exploits in Prometheus? Wouldn't that compound the security problems in the cluster by the number of nodes (i.e. an attacker can now exploit any node in the cluster since the container is everywhere)?

@haanhvu (Contributor, Author) commented

I was just thinking that the scope of security could be "isolated" to each node, because there's no inter-node communication (between Prometheus and the scrape targets). But after more digging I realized this was too naive a view. Scratched it here (and in the next commit): #6600 (comment)
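For background on the node-local scraping this thread keeps returning to: in DaemonSet mode, each agent typically restricts Kubernetes service discovery to its own node with a field selector, with the node name injected through the downward API. A hedged sketch in raw Prometheus configuration (the job name and the `NODE_NAME` variable are assumptions for illustration; prometheus-config-reloader can expand such environment variables in the config):

```yaml
# Hedged sketch: pod discovery limited to the agent's own node.
scrape_configs:
  - job_name: node-local-pods        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
        selectors:
          - role: pod
            # NODE_NAME is assumed to be injected via the Pod's downward API
            field: spec.nodeName=$(NODE_NAME)
```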


> The non-goals are the features that are not easy to implement and require more investigation. We will need to investigate whether there are actual user needs for them and, if yes, how to best implement them. We’ll handle these after the MVP.
> * ServiceMonitor support: There's a performance issue regarding this feature. Since each Prometheus Agent running on a node requires one watch, making all Prometheus Agent pods watch all endpoints will put a huge stress on the Kubernetes API server. This is the main reason why GMP hasn’t supported this, even though there are user needs stated in some issues ([#362](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/362), [#192](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/192)). However, as discussed with Danny from GMP [here](https://github.com/GoogleCloudPlatform/prometheus-engine/issues/192#issuecomment-2028850846), ServiceMonitor support based on EndpointSlice seems like a viable approach. We’ll investigate this further after the MVP.
> * Storage: We will need to spend time studying more about the WAL, different storage solutions provided by Kubernetes, and how to gracefully handle storage in different cases of crashes. For example, there’s an [issue in Prometheus](https://github.com/prometheus/prometheus/issues/8809) showing that samples may be lost if remote write didn’t flush cleanly. We’ll investigate these further after the MVP.
A Contributor left a comment

What about mixed deployment cases? DaemonSet vs. others? Would they be part of goals or non-goals?

@haanhvu (Contributor, Author) commented on May 21, 2024

@simonpasquier once mentioned mixed-mode cases. In general, what additional things do we need to do to enable mixed modes? I haven't been able to see that clearly.

A Contributor left a comment

Not sure; what matters is what intention you have here when testing/designing features. It feels like mixed modes is a goal then?

A Contributor left a comment

What do you mean by mixed deployment goals @bwplotka?
What I mentioned before is that someone could very well deploy "statefulset" Prometheus/PrometheusAgent resources alongside "daemonset" PrometheusAgent resources.

A Contributor left a comment

> What I mentioned before is that someone could very well deploy "statefulset" Prometheus/PrometheusAgent resources alongside "daemonset" PrometheusAgent resources.

Exactly that - mixed deployment. I mean we should put that in the goals section to keep allowing those cases 👍🏽

@haanhvu (Contributor, Author) commented on May 26, 2024

I described this in the Pitfalls of the current solution and Audience sections in the last commit. But I'm wondering whether I should put it in Goals too.

@simonpasquier @ArthurSens @kakkoyun Should we clarify in How that in the MVP we won't allow switching from a live StatefulSet to DaemonSet => if early adopters are using StatefulSet and want to deploy mixed mode, they will have to first delete the StatefulSet object, then deploy the DaemonSet, then deploy the StatefulSet again?

AFAIU Goals/Non-goals are what guide the How section. If we need to clarify this in How then maybe we need to add mixed deployment to Goals.

A Contributor left a comment

What I mean by "mixed deployment" involves distinct resources scraping distinct targets. For instance:

```yaml
# Normal Prometheus server scraping control-plane components
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: control-plane
spec:
  serviceMonitorSelector:
    matchLabels:
      kubernetes.io/part-of: control-plane
---
# DaemonSet Prometheus agent scraping data-plane components like kubelet
apiVersion: monitoring.coreos.com/v1alpha1
kind: PrometheusAgent
metadata:
  name: data-plane
spec:
  serviceMonitorSelector:
    matchLabels:
      kubernetes.io/part-of: data-plane
  mode: DaemonSet
```

In this case, there's no intersection between the sets of targets.

> * Replica
> * Shard
> * Storage

A Contributor left a comment

> In the MVP, we will simply fail the reconciliation if any of those fields are set.

Would that potentially break scraping if a user were to switch from mode: StatefulSet to mode: DaemonSet?

@haanhvu (Contributor, Author) commented on May 22, 2024

If users want to switch from StatefulSet to DaemonSet, they would have to unset the unsupported fields (if they had set them in StatefulSet). Besides documentation, do you have any ideas on how to make this switch smoother?

I discussed with @simonpasquier and @ArthurSens about whether we should simply log, or completely fail the reconciliation when unsupported fields are set. We concluded that having a log might not be enough, because users might neglect it and keep thinking that the unsupported fields would work. Do you have any ideas on this?

Maybe we need a test case for this switch?

A Contributor left a comment

Dare I say - a failing webhook on the CRD?

A Member left a comment

Ah, very good point! It didn't occur to me that one could switch from statefulset to daemonset in a live object.

We might need CEL earlier than we thought? From my understanding, it works like an admission webhook.

A Contributor left a comment

For a first approach, I'd consider that failing the reconciliation is good enough. If the validation can be modeled with CEL, I'd prefer to go this way (a validating webhook is more complex to manage).
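As a rough illustration of the CEL option: validation rules can be attached to the CRD schema via `x-kubernetes-validations`, so the API server rejects invalid objects at admission time. A hedged sketch, assuming hypothetical field names on the PrometheusAgent spec; this is not the operator's actual implementation:

```yaml
# Hedged sketch: CEL rule on the PrometheusAgent CRD spec schema that
# rejects StatefulSet-only fields when mode is DaemonSet.
x-kubernetes-validations:
  - rule: "!has(self.mode) || self.mode != 'DaemonSet' || (!has(self.replicas) && !has(self.shards) && !has(self.storage))"
    message: "replicas, shards and storage must not be set when mode is DaemonSet"
```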

A Contributor left a comment

CEL is nice because you don't need a separate webhook service to configure and route to.

Alternatively, you could add some new logic in Go code to the admission webhook. This is a nice option if CEL is insufficient for what you want to check (e.g. you have some Prometheus library-based validation or something that cannot be codified easily in CEL).

A Contributor left a comment

FWIW we have a "dedicated" CEL issue #5079


@simonpasquier (Contributor) left a comment

Virtual approval from my side :)

@simonpasquier (Contributor) left a comment

LGTM. There are a few Markdown issues reported by the linter that need fixing though.

@haanhvu (Contributor, Author) commented on Jun 3, 2024

> LGTM. There are a few Markdown issues reported by the linter that need fixing though.

Yeah, do you want to merge it now, or leave it open for a little while as we discussed?

@haanhvu force-pushed the agent-daemonset-proposal branch from 2e8b843 to 9a8a583 on June 19, 2024.
@haanhvu (Contributor, Author) commented on Jun 19, 2024

@ArthurSens @simonpasquier @kakkoyun I resolved all the comments. We have left this open for a while. Since there are no new reviews, I think we can merge this now.

@kakkoyun (Member) left a comment

LGTM.

I'll go ahead and merge it now. We can always send subsequent PRs if we change decisions.

@kakkoyun enabled auto-merge on June 19, 2024.
@haanhvu (Contributor, Author) commented on Jun 20, 2024

The auto-merge doesn't work, probably because the tests in CI don't run (seems like they only run in code-related PRs?)

@ArthurSens (Member) commented

> The auto-merge doesn't work, probably because the tests in CI don't run (seems like they only run in code-related PRs?)

Correct 😅

@ArthurSens disabled auto-merge on June 20, 2024.
@ArthurSens merged commit 8cf75b8 into prometheus-operator:main on Jun 20, 2024. 9 checks passed.
openshift-merge-bot (bot) pushed a commit to stolostron/prometheus-operator that referenced this pull request on Aug 2, 2024.