Any vector space equipped with a norm carries a metric: the distance between two vectors is the norm of their difference. Here, the "distance" between $\rho$ and $\sigma$ is computed as the norm of $\rho-\sigma$, so a metric has already been defined. You could call this the trace distance metric. It is different from the other metrics you mentioned and does not possess a clean geometric visualization (https://quantumcomputing.stackexchange.com/a/29716/15820).
Now let's try to get a handle on how this metric relates to other quantities that you mentioned.
Suppose that $\rho=\rho_{\theta-d\theta/2}$ and $\sigma=\rho_{\theta+d\theta/2}$. For this infinitesimal change, the fidelity $F$ is related to the quantum Fisher information $\mathcal{F}$ by $F(\rho,\sigma)\approx 1-\mathcal{F}(\rho_\theta)d\theta^2/4$. (The QFI is not related to the quantum relative entropy, even though the classical Fisher information is related to the classical relative entropy; see https://quantumcomputing.stackexchange.com/a/29716/15820.) In addition, the trace distance $\Delta$ is bounded from both sides by functions of the fidelity as $1-\sqrt{F(\rho,\sigma)}\leq \Delta(\rho,\sigma)\leq \sqrt{1-F(\rho,\sigma)}$, from which we conclude, to leading order in $d\theta$,
$$\frac{\mathcal{F}(\rho_\theta)}{8}d\theta^2\leq \Delta(\rho_{\theta-d\theta/2},\rho_{\theta+d\theta/2})\leq \frac{\sqrt{\mathcal{F}(\rho_\theta)}}{2}d\theta. $$ So, at the very least, we learn that the trace distance grows at a rate somewhere between $\mathcal{O}(d \theta)$ and $\mathcal{O}(d \theta^2)$.
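As a rough numerical sanity check of these bounds, here is a minimal sketch in Python with NumPy. The qubit family `psi(theta)` is a hypothetical choice made only for illustration (it is not from the question); for this family the QFI equals 1, so the lower bound $1-\sqrt{F}$ should behave like $d\theta^2/8$ and the upper bound $\sqrt{1-F}$ like $|d\theta|/2$.

```python
import numpy as np

def trace_distance(rho, sigma):
    """Half the sum of the absolute eigenvalues of the Hermitian difference."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def psi(theta):
    """Hypothetical real qubit family (an assumption for illustration); its QFI is 1."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def check_bounds(theta, dtheta):
    """Return (lower bound, trace distance, upper bound) for rho_{theta -/+ dtheta/2}."""
    a, b = psi(theta - dtheta / 2), psi(theta + dtheta / 2)
    rho, sigma = np.outer(a, a), np.outer(b, b)
    F = np.dot(a, b) ** 2                      # fidelity of two pure, real states
    lower = 1 - np.sqrt(F)                     # ~ QFI * dtheta^2 / 8
    upper = np.sqrt(max(1 - F, 0.0))           # ~ sqrt(QFI) * |dtheta| / 2
    return lower, trace_distance(rho, sigma), upper

for dtheta in (1e-1, 1e-2, 1e-3):
    lower, delta, upper = check_bounds(0.7, dtheta)
    print(f"dtheta={dtheta:g}: {lower:.3e} <= {delta:.3e} <= {upper:.3e}")
```

For pure states the trace distance coincides with the upper bound, so this particular family sits at the $\mathcal{O}(d\theta)$ end of the range; mixed states can fall strictly between the two bounds.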
It may be even easier to just use the definition of the trace distance: $\Delta(\rho,\rho+d\rho)=\frac{1}{2}\mathrm{Tr}(\sqrt{d\rho^\dagger d\rho})$. Then we notice that $d\rho^\dagger=d\rho$, because the difference between two Hermitian operators must also be Hermitian, leading us to $\Delta(\rho,\rho+d\rho)\propto \mathrm{Tr}(\sqrt{d\rho^2})$. This is the sum of the singular values of $d\rho$ (the absolute values of its eigenvalues), so you can see how it could grow like $|d\theta|$ and fail to have smooth derivatives. An easy example is $d\rho=|1\rangle\langle 1|d\theta+|2\rangle\langle 2|d\phi$ for some orthogonal states $\langle 1|2\rangle=0$ (ignoring, for illustration, the constraint $\mathrm{Tr}\,d\rho=0$), from which $\Delta=(|d\theta|+|d\phi|)/2$, opening like an inverted pyramid when plotted versus $d\theta$ and $d\phi$.
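The inverted-pyramid behaviour can be checked numerically. The sketch below (Python with NumPy; the function names are hypothetical and chosen only for this illustration) builds the diagonal $d\rho$ from the example above in the basis $\{|1\rangle,|2\rangle\}$ and evaluates $\frac{1}{2}\mathrm{Tr}\sqrt{d\rho^\dagger d\rho}$ both from the eigenvalues and from the singular values.

```python
import numpy as np

def half_trace_norm_eig(d_rho):
    """(1/2) * sum of |eigenvalues|; valid because d_rho is Hermitian."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(d_rho)))

def half_trace_norm_svd(d_rho):
    """(1/2) * sum of singular values, i.e. (1/2) Tr sqrt(d_rho^dagger d_rho)."""
    return 0.5 * np.sum(np.linalg.svd(d_rho, compute_uv=False))

def pyramid(dtheta, dphi):
    """Delta for d_rho = |1><1| dtheta + |2><2| dphi (illustrative example only)."""
    d_rho = np.diag([dtheta, dphi]).astype(float)
    return half_trace_norm_eig(d_rho)

# Both evaluations agree with the closed form (|dtheta| + |dphi|) / 2.
d_rho = np.diag([0.3, -0.4])
print(half_trace_norm_eig(d_rho), half_trace_norm_svd(d_rho), (0.3 + 0.4) / 2)

# The one-sided slopes at the origin differ in sign, which is the kink of the pyramid.
h = 1e-6
print(pyramid(h, 0.0) / h, pyramid(-h, 0.0) / (-h))
```

The two one-sided difference quotients come out with opposite signs, which is the non-smoothness at $d\theta=0$ described above.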
Your final paragraph is probably the subject of another question, so I will leave it for now.