There is an extensive series of articles in the Guardian and the Washington Post explaining NSA capabilities. Bruce Schneier's blog (go back to the period of the Snowden revelations) is also useful if you already know enough about computers, which I suspect from your question you don't.
The short answer is that when the NSA really wants to target someone, it can get an enormous amount of information.
Techniques include:
- altering the target's hardware -- a computer the target orders from Dell or Lenovo or whoever can be physically intercepted en route, or even at the factory depot, and a bug installed
- using zero-day exploits against the target's equipment to install trojan horses, keystroke loggers and the like
- using more or less legitimate means (secret subpoenas, national security letters and coerced cooperation) to get Internet service providers, telecoms and backbone providers to record the target's activity
- using completely illegitimate means (hacking and intrusion) on foreign service providers
- compromising both commercial and open-source encryption software
Using all these techniques, they can gather almost every bit and byte an unwitting target communicates. A target would have to be supremely tech savvy (or have the support of a skillful and dedicated IT organization) and also take extraordinary measures (constantly ditching and swapping out stolen phones, never using the same computer twice, etc.) to have a prayer of evading their scrutiny.
Edit: oh yeah, I think analyzing voice conversations and using a "voiceprint" to ID a speaker across an entire metro area is at the limit of current tech. It's probably impossible to do in real time over such a large area because of the physical layout of the networks. If it were possible at all, it would be a very expensive effort and would need phone company cooperation. It could also be easily defeated if the speaker uses a voice scrambler, including one of those cheap toy voice-changers. Analyzing the streams coming from a particular small set of phone lines or mobiles, as opposed to an entire metro area, would be much easier.