It's a thorny issue. As an indie dev or studio, I get using cheap (or free) art, be it voice, textures, whatever. In a way, a properly licensed AI-trained voice is no different from using assets from an asset store.
On the other hand, the current crop of AI tools is less than fair about where it sources its data, so good luck getting a morally neutral voice right now, leaving aside the legal aspect.
A big issue beyond that is how it'll completely wreck the industry. If Alice licensed her voice for cheap, and I can get it to say whatever I need with minimal hassle, why wouldn't I use that over paying more for a voice actor, where I have to wait on them to actually record and rerecord their lines? I'd be paying more for slower results and more work.
Then you realize this is true not just for me but for most groups needing voice lines. So even if an individual voice seems ethically sound, it becomes far less simple once you consider the wider context and the impact on other voice actors.
This works with splitters (and you can combine cables and splitters to get there; it doesn't need to be a single Y cable with the right ends). I'd recommend against it, however. Two outputs for one source is usually fine, but two sources into one output with just a Y splitter can be detrimental, since each output ends up driving the other. Depending on the exact circumstances, sound quality can be worse and/or it can (theoretically) damage your equipment.
For two low-level sources it's probably fine in most cases, but definitely not recommended. There are passive summing circuits that prevent this, but the usual recommendation is to use a mixer instead.
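To put rough numbers on why a bare Y is risky, here's a small sketch that treats each source as an ideal voltage behind its output impedance and solves for the shared node with Millman's theorem. All the values (2 ohm output impedance, 10k input, 1k summing resistors) are made up for illustration, not taken from any particular device, and a purely resistive model ignores coupling capacitors and whatever the output stages actually do under load.

```python
def junction_voltage(sources, load_ohms):
    """Voltage at the shared node (Millman's theorem).

    sources: list of (open_circuit_volts, series_ohms), one per output.
    load_ohms: input impedance of whatever the cable feeds.
    """
    total_conductance = sum(1.0 / r for _, r in sources) + 1.0 / load_ohms
    return sum(v / r for v, r in sources) / total_conductance


def source_currents(sources, load_ohms):
    """Current each output delivers (negative = current is pushed INTO it)."""
    vj = junction_voltage(sources, load_ohms)
    return [(v - vj) / r for v, r in sources]


# Hypothetical values: A is at a 1 V peak, B is momentarily silent.
OUT_Z = 2.0        # assumed output impedance of each source, ohms
LOAD_Z = 10_000.0  # assumed line-input impedance, ohms
SUM_R = 1_000.0    # assumed summing resistor per source, ohms

# Bare Y cable: only the output impedances sit between the two sources.
bare_y = source_currents([(1.0, OUT_Z), (0.0, OUT_Z)], LOAD_Z)
# Passive summing: a resistor in series with each source before the junction.
summed = source_currents([(1.0, OUT_Z + SUM_R), (0.0, OUT_Z + SUM_R)], LOAD_Z)

print(f"bare Y:  A delivers {bare_y[0] * 1000:.1f} mA, B absorbs {-bare_y[1] * 1000:.1f} mA")
print(f"summing: A delivers {summed[0] * 1000:.2f} mA, B absorbs {-summed[1] * 1000:.2f} mA")
```

With these assumed numbers the bare Y has A pushing roughly 250 mA, nearly all of it into B's output stage, while the resistor version keeps it around half a milliamp. Either way only about half of A's level reaches the input, and the sources still aren't fully isolated from each other, which is part of why a mixer is the usual answer.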