
[InferenceClient] Provide a way to deal with content-type header when sending raw bytes #2706

Open
Wauplin opened this issue Dec 11, 2024 · 2 comments

Wauplin (Contributor) commented Dec 11, 2024

First reported by @freddyaboulton on Slack (private).

For some tasks (e.g. automatic-speech-recognition) we send raw bytes in the HTTP request. On the Inference API, there is "content-type guess" logic that implicitly interprets the data being sent. However, this logic can be flawed, and moreover Inference Endpoints don't have it.

It is currently possible to provide the Content-Type header by passing a value when initializing the InferenceClient:

client = InferenceClient(url, headers={"Content-Type": "audio/mpeg"})
response = client.automatic_speech_recognition("audio.mp3")

However, this is not well documented, and it doesn't feel correct to initialize a client with a specific content type (it would mean that all requests made with that client have to send the same content type).


Opening the issue here, but it's not clear to me what we'd like to do. A few possible solutions:

  • (low-hanging fruit) keep what we have, but document how to set the Content-Type header in each method's docstring
  • infer the Content-Type header from the file extension (but that doesn't work if raw bytes are passed to the method); see the sketch after this list
  • add a parameter? (but how would that work with auto-generation?)
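
To illustrate the second option, here is a minimal sketch using Python's standard mimetypes module (guess_content_type is a hypothetical helper, not library code); it also shows why an extension-based guess cannot work for raw bytes:

import mimetypes
from pathlib import Path
from typing import Optional, Union

def guess_content_type(audio: Union[str, Path, bytes]) -> Optional[str]:
    # Hypothetical helper: guess the Content-Type header from the file extension.
    # Raw bytes carry no extension, so there is nothing to infer from.
    if isinstance(audio, (str, Path)):
        content_type, _ = mimetypes.guess_type(str(audio))
        return content_type
    return None

print(guess_content_type("audio.mp3"))        # audio/mpeg
print(guess_content_type(b"\x00raw bytes"))   # None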

Open to suggestions on this

@kiansierra

Hi there.
I received the following warning when directly calling the post endpoint for ASR:

/home/kian/coding/rag-app/rag-app-backend/.venv/lib/python3.12/site-packages/huggingface_hub/utils/_deprecation.py:131: FutureWarning: 'post' (from 'huggingface_hub.inference._client') is deprecated and will be removed from version '0.31.0'. Making direct POST requests to the inference server is not supported anymore. Please use task methods instead (e.g. InferenceClient.chat_completion). If your use case is not supported, please open an issue in https://github.com/huggingface/huggingface_hub.

Currently my call to the endpoint follows this approach:

import base64
import json

from huggingface_hub import InferenceClient

client = InferenceClient()
model = "openai/whisper-large-v3-turbo"
response = client.post(
    json={
        # base64-encode the audio so that parameters can be sent alongside it in the JSON payload
        "inputs": base64.b64encode(open(file_path, "rb").read()).decode(),
        "parameters": {"return_timestamps": True},
    },
    model=model,
    task="automatic-speech-recognition",
)
response_data = json.loads(response)

This is because the endpoint currently doesn't support parameters: https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_client.py#L461-L502
I believe a simple change to the endpoint to accept parameters as a dict should solve this. The same could also be done for headers, to ensure that headers passed when calling the endpoint override the ones set on the instance.
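
For the header part, a minimal sketch of the override behaviour described above (merge_headers is a hypothetical helper, not library code): per-call headers take precedence over the ones set at client init.

def merge_headers(instance_headers, call_headers):
    # Start from the headers configured on the client instance,
    # then let per-call headers override them key by key.
    merged = dict(instance_headers or {})
    merged.update(call_headers or {})
    return merged

print(merge_headers({"Content-Type": "audio/flac"}, {"Content-Type": "audio/mpeg"}))
# {'Content-Type': 'audio/mpeg'}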

@kiansierra

I created a quick PR here: kiansierra#1,
and an example of how that would work: https://github.com/kiansierra/huggingface_hub/blob/dae9625fed13668020504c3fbace2d9b4f18791c/test.ipynb.
This would also tie in with #2800. @Wauplin @hanouticelina
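
A purely hypothetical sketch of what a call could look like if the task method accepted a parameters dict as suggested above (this is not the actual API of the PR, just an illustration of the idea):

from huggingface_hub import InferenceClient

# Hypothetical signature: automatic_speech_recognition does not accept
# a `parameters` argument today; this only illustrates the proposal.
client = InferenceClient()
response = client.automatic_speech_recognition(
    "audio.mp3",
    model="openai/whisper-large-v3-turbo",
    parameters={"return_timestamps": True},
)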
