S3diff: Semantic Fusion and Structure-Guided Global Generation from a Single Image with Diffusion Models