Neural Synesthesia: When Art Meets GANs
Neural Synesthesia is an AI art project that aims to create new and unique audiovisual experiences with artificial intelligence. It does this through collaborations between humans and generative networks. The results feel almost like organic art. Swirls of color and images blend together as faces, scenery, objects, and architecture transform to music. There’s a sense of things swinging between feeling unique and at the same time oddly familiar.
Neural Synesthesia was created by Xander Steenbrugge, an online content creator who made his start in data science while working on brain-computer interfaces. During his master thesis, he helped build a system that classified imagined movement through brain signals. This system allowed patients suffering from Locked-in syndrome to manipulate physical objects with their minds. The experience impressed upon Steenbrugge the importance of machine learning, and the potential for AI technology to build amazing things.
Outside of Neural Synesthesia, Steenbrugge works with a startup using machine learning for drug discovery and runs a popular YouTube channel. He’s also working on wzrd.ai, a platform that augments audio with immersive video through the work of AI. In this interview, we talk about Neural Synesthesia’s inspiration, how it works, and discuss AI and creativity.
What were the inspirations for Neural Synesthesia?
I’ve always had a fascination for aesthetics. Examples are mountain panoramas, indie game design, scuba diving in coral reefs, psychedelic experiences, and films by Tarkovsky. Beautiful visual scenes have the power to convey meaning without words. It’s almost like a primal, visual language we all speak intuitively.
When I saw the impressive advances in generative models (especially GANs), I started imagining where this could lead. Just like the camera and the projector brought about the film industry, I wondered what narratives could be built on top of the deep learning revolution. To get hands-on with this, my first idea was to simply tweak the existing codebases for GANs to allow for direct visualization of audio. This was how Neural Synesthesia was born.
How much work did you do for the first Neural Synesthesia piece? Did you face any unique challenges?
I think coding for the first rendered video took over six months because I was doing it in my spare time. The biggest challenge was how to manipulate the GAN's latent input space using features extracted from the audio track. I wanted to create a satisfying match between visual and auditory perception for viewers.
Here’s a little insight into what I do: I apply a Fourier Transform to extract time-varying frequency components from the audio. I also perform harmonic/percussive decomposition, which basically separates the melody from the rhythmic components of the track. These three signals (instantaneous frequency content, melodic energy, and beats) are then combined to manipulate the GAN's latent space, resulting in visuals that are directly controlled by the audio.
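To make this concrete, here is a minimal sketch of that kind of audio-to-latent mapping, assuming librosa for the analysis step. The function name, frame rate, and mixing weights are illustrative assumptions, not the project's actual code.

```python
# Rough sketch of the audio-analysis step described above, using librosa.
# Names and mixing weights are illustrative; this is not the project's code.
import numpy as np
import librosa

def audio_to_latents(audio_path, latent_dim=512, fps=30, seed=0):
    y, sr = librosa.load(audio_path)
    hop = sr // fps  # one analysis frame per video frame

    # 1) Time-varying frequency content via the short-time Fourier transform.
    spec = np.abs(librosa.stft(y, hop_length=hop))

    # 2) Harmonic/percussive decomposition: melody vs. rhythmic components.
    harmonic, percussive = librosa.decompose.hpss(spec)
    melodic_energy = harmonic.mean(axis=0)
    onset_env = librosa.onset.onset_strength(
        S=librosa.power_to_db(percussive**2), sr=sr)

    # Normalize each signal to [0, 1] so they can be mixed on equal footing.
    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    n_frames = min(spec.shape[1], len(onset_env))
    freq_bands = norm(spec[:latent_dim, :n_frames]).T   # (n_frames, latent_dim)
    melody = norm(melodic_energy[:n_frames])[:, None]   # (n_frames, 1)
    beats = norm(onset_env[:n_frames])[:, None]         # (n_frames, 1)

    # 3) Combine the three signals into a trajectory through latent space:
    #    a fixed base point, modulated per frame by the audio features.
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(1, latent_dim))
    latents = (base
               + 0.5 * freq_bands
               + 0.3 * melody
               + 0.8 * beats * rng.normal(size=(1, latent_dim)))
    return latents  # feed one row per frame into the GAN generator
```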
Is every image dataset unique? How do you collect images for these datasets, and how many images do you need?
I spent a lot of time collecting large and diverse image training data to create interesting generative models. Unlike most GAN work, these datasets have aesthetics as their primary goal rather than realism. Experimenting with various blends of image collections is time-consuming, since GAN training requires lots of compute and I don’t exactly have a data center at my disposal.
Most of the datasets I use are image sets I’ve encountered over the years. I saved them because I knew one day I’d have a use for them. I’ve always had an interest in aesthetics so when I discover something that triggers that sixth sense, I save it.
Most GAN papers use datasets of more than 50,000 images, but in practice you can get away with fewer examples. The first step is to start from a pre-trained GAN model that has already been trained on a large dataset. This means the convolutional filters in the model are already well-shaped and contain useful information about the visual world. Secondly, there’s data augmentation, which is basically flipping or rotating an image to effectively increase the amount of training data. Since I don’t really care about sample realism, I can actually afford to do very aggressive image augmentation. This results in many more training images than actual source images. For example, the model I used for a recent performance at Tate Modern had only 3,000 real images, aggressively augmented to a training set of around 70,000.
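As an illustration of the augmentation idea, here is a hedged sketch using torchvision; the specific transforms, image size, and file paths are assumptions for demonstration, not the pipeline behind the Tate Modern model.

```python
# Sketch of aggressive image augmentation for a small GAN training set.
# Transform choices and paths are illustrative assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(degrees=30),
    transforms.RandomResizedCrop(512, scale=(0.6, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
])

# A few thousand source images; every epoch the loader yields a differently
# augmented view of each image, so the effective training set is much larger.
dataset = datasets.ImageFolder("data/source_images", transform=augment)
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)

# Fine-tuning from a pre-trained generator/discriminator (rather than training
# from scratch) keeps the convolutional filters' general visual priors intact.
# generator.load_state_dict(torch.load("pretrained_generator.pt"))  # hypothetical checkpoint
```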
Recently, a lot of new research explicitly addresses the low-data regime for GANs (such as what you can find here, here, and here). My current codebase leverages these techniques to train GANs with as little as a few hundred images.
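One recurring idea in that line of research is differentiable augmentation: the same kind of random, differentiable transform is applied to both real and generated images before they reach the discriminator, so it cannot simply memorize a handful of real images. The sketch below is a simplified illustration of the concept, not any particular paper's reference implementation.

```python
# Minimal sketch of differentiable augmentation for low-data GAN training.
# Both real and fake batches pass through the same random augmentations
# before the discriminator sees them. Illustrative only.
import torch
import torch.nn.functional as F

def diff_augment(x, brightness=0.2, shift_frac=0.125):
    # Random brightness jitter (differentiable with respect to x).
    b = (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5) * 2 * brightness
    x = x + b
    # Random integer translation implemented with pad + roll + crop.
    m = int(x.size(-1) * shift_frac)
    dx, dy = (int(torch.randint(-m, m + 1, (1,))) for _ in range(2))
    x = torch.roll(F.pad(x, (m,) * 4), shifts=(dy, dx), dims=(2, 3))
    return x[:, :, m:-m, m:-m]

def d_loss(discriminator, real, fake):
    # Augmenting real and generated batches alike prevents the discriminator
    # from overfitting to the few real images (non-saturating GAN loss).
    logits_real = discriminator(diff_augment(real))
    logits_fake = discriminator(diff_augment(fake.detach()))
    return F.softplus(-logits_real).mean() + F.softplus(logits_fake).mean()
```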
You talk about Neural Synesthesia as a collaboration between yourself and AI. What kind of potential do you see for the future of creative projects utilizing AI technology?
This is actually the most interesting part of the entire project. I usually set out with specific intentions as to what type of visual I want to create. I then curate my dataset, tune the parameters of the training script, and start training the model. A full training run usually requires a few days to converge. Very quickly though, the model starts returning samples that are often unexpected and surprising. This sets an intriguing feedback loop into motion, where I change the code of the model, the model responds with different samples, I react, and it goes on. The creative process is no longer fully under my control; I am effectively collaborating with an AI system to create these works.
I truly believe this is the biggest strength of this approach: you are not limited by your own imagination. There’s an entirely alien system that is also influencing the same space of ideas, often in unexpected and interesting ways. This leads you as a creator into areas you never would have wandered by yourself.
Looking at the tremendous pace of progress in the field of AI strongly motivates me to imagine what might be possible 10 years from now. After all, modern Deep Learning is only 8 years old! I expect that Moore’s law will continue to bring more powerful computing capabilities, that AI models will continue to scale with more compute, and that the possibilities of this medium will follow this exponential trend.
Neural Synesthesia in its current form is a prototype. It’s a version 0.1 of a grander idea to leverage deep learning as the core component of the advanced interactive media experiences of the future.
What kind of creative works do you have planned for the future of Neural Synesthesia? Do you have any goals or future plans?
I’ve always been fascinated by the overview effect, where astronauts describe how seeing the Earth in its entirety from space profoundly changes their worldview, kindling the awareness that we are all part of the same, fragile ecosystem, suspended in the blackness of space.
To me, this is great evidence that profound, alienating experiences can have spectacular effects on people’s choices and behaviors. And what we need is a shift in perception away from tribal feelings of us versus them. We need to move towards a global society with common goals and common challenges.
Our world increasingly faces global issues that stem from our locally-centered world views. These views are deeply rooted in our genes; we evolved in small tribes that only needed to attend to their local environments. However, the world is evolving towards a globally connected web of events, where the present can no longer be disconnected from the system as a whole. For example, look at climate change, and at people fighting over artificially drawn borders of nationality, race, or even gender.
As such, my long-term vision is to create rich, immersive experiences with the power to shift perspectives. Cinema 2.0, if you will. I imagine an interactive experience, where a group of people can enter an AI-generated world (e.g. using Virtual Reality headsets) where the visual scenery is so utterly alien and breathtaking that it forces the mind to temporarily halt its usual narrative of describing what’s going on. This is essentially the goal of meditation: to experience the world as it is, emphasizing the experience of the present moment rather than the narrative we construct around it.
The goal then, is to mimic the perceptual shift one can experience from a positive psychedelic experience, meditative insight, or a trip to space. To realize that our ‘normal’ world view is just a tiny sliver of what it is possible to experience. I believe this perceptual shift is probably the most unique human characteristic. It allows the great wonder of imagination to power our world, and is the most powerful tool we have to tackle the world’s largest challenges.
From a technology standpoint, how far away are we from creating these basic “cinema 2.0” experiences?
I would say that from a technical point of view, we’re getting very close. The latest generative models (e.g. StyleGAN2 or BigGAN-deep) are able to create very realistic samples and allow for very high diversity. What is still lacking are creative tools that let non-coders use this technology to get creative. The main challenge, at least for me, is to create a compelling narrative.
You can see more of Steenbrugge’s Neural Synesthesia work at its dedicated homepage, and try out wzrd.ai here. He’s also active on YouTube and Twitter, and open to collaborating with other creatives who have similar ideas and aspirations. You can contact him at neuralsynesthesia@gmail.com.
Original article reposted with permission.
Translated from: https://medium.com/datadriveninvestor/neural-synesthesia-when-art-meets-gans-6453c7c0c5b8