We describe a platform that uses temporally integrated co-simulation of emulated devices and simulation of the networks that connect them, for activities such as performance evaluation and resilience assessment. In our approach, all emulated and simulated components are time-synchronized to a virtual clock. We propose and study an approach that uses compiler analysis to augment emulated code with logic for precise instruction-level tracking of execution paths. This is combined with a mechanism that ascribes virtual time to each execution burst based on the sequence of executed instructions. The overhead of synchronization between emulated and simulated components is reduced by compiler-based identification of “lookahead”: epochs of emulated execution during which a process can be predicted to act independently of any other. Through evaluations, we show that our approach enables fast and repeatable execution of co-simulated models.
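A minimal sketch of the virtual-time ascription idea: each executed instruction contributes a class-dependent cost, and an execution burst advances the process's virtual clock by the summed cost. The instruction classes and nanosecond costs below are invented for illustration, not the paper's calibration.

```python
# Hypothetical per-instruction virtual-time costs (nanoseconds);
# a real system would calibrate these against the target platform.
INSTR_COST_NS = {"alu": 1, "load": 4, "store": 4, "branch": 2}

def burst_virtual_time(instr_trace):
    """Virtual time ascribed to one execution burst, summed over the
    sequence of executed instruction classes."""
    return sum(INSTR_COST_NS[i] for i in instr_trace)

class VirtualClock:
    """Per-process virtual clock, advanced burst by burst."""
    def __init__(self):
        self.now_ns = 0

    def advance(self, instr_trace):
        self.now_ns += burst_virtual_time(instr_trace)
        return self.now_ns

clock = VirtualClock()
clock.advance(["load", "alu", "alu", "store"])  # 4 + 1 + 1 + 4 = 10 ns
```

Under these assumed costs, the burst above advances the clock by 10 ns; lookahead would let such bursts run without synchronizing against the network simulator.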
Agent-based modeling and simulation is an essential paradigm for answering complex, data-intensive research questions, absorbing and processing emergent insights from often large-scale scenarios. Such scenarios demand execution within a distributed simulation system. One critical factor in these runtime systems is how the involved agents are distributed and partitioned. Unsuitable partitioning schemes lead to computing load that must be synchronized continuously, which risks drastic performance reductions. Work on load balancing has produced a series of distribution schemes that rely on geometric decomposition: the partitioning decomposes the agent environment so that when an agent leaves one partition, it instantly switches to another. However, many simulations cannot be partitioned in this spatial way, for example because no spatial reference exists or because agents interact at different levels of temporal granularity.
This paper presents a new partitioning approach for distributed agent-based simulation systems that supports both spatially and non-spatially attached simulations with freely definable distribution keys. Our approach defines groups of agents that migrate between servers at runtime following a partitioning plan. We represent the allocated resources by intercepting interactions and aggregating them into a frequency graph. By determining the frequency of interactions between agents and objects, we aim to balance the load across servers and reduce the number of agent transfers. We evaluate our approach with a real-world scenario; the results show that this adaptive partitioning can increase execution performance by a factor of 10, with up to 89% lower latencies than existing solutions.
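The core idea can be sketched as follows, under simplifying assumptions: interaction events are aggregated into a weighted frequency graph, and a capacity-bounded greedy placement co-locates agents that interact often. The function names and the greedy rule are illustrative, not the paper's algorithm.

```python
import math
from collections import Counter

def build_frequency_graph(interactions):
    """Aggregate observed agent-to-agent interactions into weighted edges."""
    return Counter(tuple(sorted(pair)) for pair in interactions)

def greedy_partition(graph, agents, n_servers):
    """Capacity-bounded greedy placement: each agent goes to the server
    with which it communicates most, among servers that still have room."""
    cap = math.ceil(len(agents) / n_servers)
    parts = [set() for _ in range(n_servers)]
    volume = Counter()
    for (a, b), w in graph.items():
        volume[a] += w
        volume[b] += w
    # place heavy communicators first so their partners can follow them
    for agent in sorted(agents, key=lambda a: -volume[a]):
        open_servers = [i for i in range(n_servers) if len(parts[i]) < cap]
        affinity = {i: sum(graph.get(tuple(sorted((agent, o))), 0)
                           for o in parts[i]) for i in open_servers}
        best = max(open_servers, key=lambda i: (affinity[i], -len(parts[i])))
        parts[best].add(agent)
    return parts
```

With two frequently interacting pairs and two servers, the sketch places each pair on its own server, reducing cross-server transfers while keeping the load balanced.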
To date, many reasons have been suggested for building explainable artificial intelligence (XAI) models. However, it remains unclear when the content an XAI system produces actually qualifies as an explanation. This paper conducts a survey to determine the requirements for information to be considered an explanation. Four minimum requirements are identified and prioritized by how strongly each distinguishes mere information from an explanation.
The performance benefits of new methods and optimizations are among the main subjects of the parallel and distributed simulation field. However, performance evaluations of the various simulators often rely on custom models, parametrizations, and baseline implementations, which complicates direct comparisons. We present our vision and initial steps towards COMPADS (short for COMparing Parallel And Distributed Simulators), a benchmark model and repository for reproducibly comparing the performance of parallel and distributed simulators and their respective algorithms. The first results include a novel deterministic-by-design synthetic benchmark model inspired by PHOLD and La-pdes. The benchmark output is a checksum that attests to the correctness of an implementation and its execution. So far, implementations exist for the simulators ROOT-Sim and ROSS.
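The deterministic-by-design idea can be illustrated with a toy PHOLD-style event loop whose output is a checksum over the processed events; any correct implementation, sequential or parallel, must reproduce the same value. The event rule and checksum below are invented for illustration and are not the COMPADS model.

```python
import heapq

def toy_benchmark(n_lps, end_time):
    """Toy deterministic PHOLD-style benchmark: every processed event
    deterministically schedules one successor via an LCG, and a checksum
    folds in each processed (lp, timestamp) pair."""
    pq = [(float(i), i, i) for i in range(n_lps)]  # (time, lp, lcg state)
    heapq.heapify(pq)
    checksum = 0
    while pq:
        t, lp, state = heapq.heappop(pq)
        if t >= end_time:
            continue  # events past the horizon are dropped, draining the queue
        checksum = (checksum * 31 + lp + int(t * 1000)) % (1 << 32)
        state = (state * 1103515245 + 12345) % (1 << 31)  # deterministic LCG
        heapq.heappush(pq, (t + 1.0 + (state % 10) / 10.0, state % n_lps, state))
    return checksum
```

Because all randomness is derived from deterministic per-event state, repeated runs must yield an identical checksum, which is what lets the checksum attest to correct execution.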
Pedestrian intention prediction is an important issue in crowd modeling and simulation. Existing approaches focus on short-term intention prediction, which limits their applicability in large public places that require long-term intention prediction. To this end, this paper proposes a data-driven approach to predict long-term pedestrian intention. In the proposed approach, local velocity fields are constructed from historical trajectories of pedestrians. A similarity function defined over these velocity fields is then used to predict the intermediate destinations of pedestrians. To evaluate its effectiveness, we applied the proposed approach to a real-world example, an airport terminal. The simulation results demonstrate that our approach offers effective prediction performance.
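A sketch of the two ingredients, assuming a simple grid discretization of the space and a cosine-based similarity; the actual field construction and similarity function in the paper may differ.

```python
import math
from collections import defaultdict

def build_velocity_field(trajectories, cell=1.0):
    """Average the observed step velocity in each grid cell over all
    historical trajectories that share the same destination."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for traj in trajectories:
        for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
            s = sums[(int(x0 // cell), int(y0 // cell))]
            s[0] += x1 - x0; s[1] += y1 - y0; s[2] += 1
    return {k: (sx / n, sy / n) for k, (sx, sy, n) in sums.items()}

def similarity(field, partial_traj, cell=1.0):
    """Mean cosine similarity between a partial trajectory's steps and
    the field's average velocities in the visited cells."""
    score, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(partial_traj, partial_traj[1:]):
        key = (int(x0 // cell), int(y0 // cell))
        if key not in field:
            continue
        vx, vy = field[key]
        ux, uy = x1 - x0, y1 - y0
        norm = math.hypot(vx, vy) * math.hypot(ux, uy)
        if norm:
            score += (vx * ux + vy * uy) / norm
            count += 1
    return score / count if count else 0.0

def predict_destination(fields, partial_traj):
    """Pick the destination whose velocity field best matches the
    observed partial trajectory."""
    return max(fields, key=lambda d: similarity(fields[d], partial_traj))
```

A pedestrian moving east then matches the field built from historical eastbound trajectories, so its intermediate destination is predicted accordingly.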
Container-based network emulation provides an accurate, flexible, and cost-effective testing environment for application design and evaluation. Enabling virtual time for processes running inside containers significantly improves the temporal fidelity of emulation experiments. However, the lack of precise time management during operations such as disk I/O, network I/O, and GPU computation often leads to virtual time advancement errors in existing virtual time systems. This paper proposes VT-IO, a lightweight virtual time system that integrates precise I/O time for container-based network emulation. We model and analyze the temporal error during I/O operations and develop a barrier-based time compensation mechanism in the Linux kernel. VT-IO enables accurate virtual time advancement with precise I/O time measurement and compensation. The experimental results show that the temporal error can be reduced from 87.31% to 3.6%, while VT-IO introduces only around 2% overhead in total execution time. Finally, we demonstrate VT-IO’s usability and temporal fidelity improvement with a case study of a Bitcoin mining application.
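The accounting idea behind I/O time compensation can be illustrated with a toy user-space clock; VT-IO itself implements this inside the Linux kernel, and all names and the dilation rule below are invented for illustration only.

```python
import time

class VirtualTimeKeeper:
    """Toy user-space analogue of virtual time accounting: CPU bursts
    advance virtual time scaled by a time-dilation factor (TDF), while
    wall time spent blocked in I/O is measured around a barrier and
    charged explicitly, so it is not mis-ascribed to computation."""
    def __init__(self, tdf=1.0):
        self.tdf = tdf          # time dilation factor
        self.virtual_ns = 0

    def run_burst(self, wall_ns):
        # a CPU burst of wall_ns advances virtual time by wall_ns / TDF
        self.virtual_ns += int(wall_ns / self.tdf)

    def run_io(self, io_fn):
        # "barrier": measure the real I/O duration and charge exactly
        # that amount of (dilated) time to the virtual clock
        start = time.perf_counter_ns()
        result = io_fn()
        elapsed = time.perf_counter_ns() - start
        self.virtual_ns += int(elapsed / self.tdf)
        return result
```

With a TDF of 2, a 1000 ns burst advances virtual time by 500 ns, and an I/O call adds only its measured duration rather than an unbounded scheduling gap.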
Immunotherapy consists of assisting or augmenting the immune system to prevent or treat diseases, and has shown great promise in improving upon the conventional methods of treating cancer, namely radiation therapy, chemotherapy, and surgery. However, conducting preclinical studies in this field can take months to years and often requires costly equipment and training. Researchers have used in-silico methods to reduce the time needed to conduct these studies, computationally exploring many different experimental configurations in a fraction of the time required in a wet lab. A common way to describe the interaction of the immune system with tumor cells is through ordinary and partial differential equations over the evolution of cell populations, but these top-down approaches often cannot capture the detail required to understand the dynamics operating at the cellular level. Appropriately, studies that approach these systems from the bottom up, particularly agent-based simulations, have shown success in capturing the dynamics of cell-to-cell interactions.
Quantum entanglement is an essential resource for quantum networks; thus, generation and distribution of entanglement between remote network nodes are two important tasks facing practical implementations of quantum networks. Given that the number and scale of real quantum network testbeds are still limited, we use simulation to study entanglement generation and distribution, aiming to provide insight and guidance for quantum network construction. We extend the functionality of the open-source Simulator of QUantum Network Communication (SeQUeNCe), developed by our group, to support representation of quantum states in the Fock basis, and use it to simulate entanglement generation between two atomic frequency comb (AFC) absorptive quantum memories. For entanglement distribution, building on the so-called continuous generation scheme, we propose an adaptive scheme that uses historical request information to better guide the choice of randomly generated quantum links before future requests arrive. We demonstrate the concept and power of our adaptive scheme, and explore relevant quantum memory allocation scenarios through simulation.
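One plausible reading of the adaptive scheme can be sketched as history-weighted sampling of candidate links; the weighting rule below is an illustrative assumption, not SeQUeNCe's actual mechanism.

```python
import random
from collections import Counter

def choose_links(request_history, candidate_pairs, k, rng):
    """Adaptive continuous-generation sketch: bias the k randomly
    generated entanglement links toward node pairs that past requests
    used most often. Weight 1 + demand keeps unseen pairs possible."""
    demand = Counter(request_history)
    weights = [1 + demand[p] for p in candidate_pairs]
    return rng.choices(candidate_pairs, weights=weights, k=k)
```

Pairs that have served many past requests dominate the pre-generated links, so future requests for the same pairs are more likely to find entanglement already waiting.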
In speculative Parallel Discrete Event Simulation, a fundamental component is the event set. Multiple data structures have been used to implement it in the literature, but the general strategy entails having one event set per Logical Process. Conversely, traditional sequential simulators typically employ a single event set for the whole simulation model. This paper explores the performance implications of an intermediate solution for multicore architectures, where a single event set is maintained for every worker thread.
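The intermediate design can be sketched as one priority queue per worker thread, with each Logical Process mapped to a worker. This is a single-threaded illustration with an assumed static LP-to-worker mapping; a real engine would add locking or lock-free queues.

```python
import heapq

class PerWorkerEventSets:
    """One event set (priority queue) per worker thread: coarser than
    per-LP event sets, finer than one global set for the whole model."""
    def __init__(self, n_workers):
        self.n_workers = n_workers
        self.queues = [[] for _ in range(n_workers)]

    def worker_of(self, lp):
        # assumed static mapping of Logical Processes to workers
        return lp % self.n_workers

    def schedule(self, timestamp, lp, event):
        heapq.heappush(self.queues[self.worker_of(lp)], (timestamp, lp, event))

    def next_event(self, worker):
        # each worker pops the lowest-timestamp event among its own LPs
        q = self.queues[worker]
        return heapq.heappop(q) if q else None
```

Each worker dequeues in timestamp order across all of its LPs, avoiding both a contended global queue and the per-LP bookkeeping of the classic layout.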
Complex adaptive systems are characterized by exhibiting macroscopic-level behaviors for which it is very difficult to obtain generalized descriptions based solely on knowing the dynamics of all microscopic components. An essential aspect of these systems lies in the interactions between atomic components, which in turn can change over time depending on the macroscopic state to which they contribute.
In this doctoral work we broaden and enhance the capabilities of the DEVS Modeling and Simulation formalism by enabling its application to the study of generalized complex adaptive systems, i.e. those in which the interacting components can be of hybrid nature (e.g. continuous/discrete, deterministic/stochastic, mono/multi level, with fixed/variable structure).
Smart manufacturing utilizes digital twins (DTs), virtual counterparts of production plants, for optimizing decisions. Discrete-event models (DEMs) are frequently used to model the production dynamics of the plants. To accelerate discrete-event simulations (DES), adaptive abstraction-level conversion (AAC) approaches were proposed that replace specific subcomponents of the DEM with corresponding abstracted queuing models at runtime, based on the steady state of the DEMs. However, the speedup and accuracy loss of AAC-based simulations (ABS) are highly influenced by the user-specified significance level α (the degree of tolerance of statistical invariance between two samples) and the stability of the DEMs. In this paper, we propose a simulation-based optimization (SBO) that solves the problem with a genetic algorithm (GA) while tuning the hyperparameter α at runtime to maximize the speedup of ABS under a specified accuracy constraint. For each population, the proposed method distributes the computing budget between α exploration and fitness evaluation. A discrete-gradient-based method is proposed to estimate each individual’s initial α (close to the final optimum) from the previous exploration results of neighboring individuals, so that this closeness reduces the iterative α exploration as the GA converges. We also propose a clean-up method that removes inferior results to improve the α estimation. The proposed method was applied to optimize raw-material releases of a large-scale manufacturing system to prove the concept and evaluate the performance under various situations.
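The warm-start idea can be sketched as follows, under simplifying assumptions: a new individual's initial α is estimated as the inverse-distance weighted average of α values already optimized for nearby genotypes. The L1 distance and the weighting rule are illustrative, not the paper's discrete-gradient method.

```python
def estimate_initial_alpha(individual, explored, default=0.05):
    """Warm-start sketch: estimate an individual's initial significance
    level alpha from the optimized alphas of previously explored
    neighboring genotypes, weighted by inverse L1 distance.
    explored: genotype tuple -> optimized alpha."""
    if not explored:
        return default  # no history yet: fall back to an assumed default
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b)) or 1e-9
    weights = {g: 1.0 / dist(individual, g) for g in explored}
    total = sum(weights.values())
    return sum(w * explored[g] for g, w in weights.items()) / total
```

As the GA converges and neighbors accumulate, the estimate lands close to the local optimum, so fewer iterative α exploration steps are spent per individual.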
It has long been said that neuromorphic computing will yield enormous energy improvements for machine-learning-based computations and will be part of the next computing revolution. Yet how likely is it that these goals are met once hardware-level constraints have been accounted for? In this paper, we benchmark the performance of a spintronics hardware platform designed for handling neuromorphic tasks. Spintronics devices, which use the spin of electrons as the information state variable, have the potential to emulate neuro-synaptic dynamics in hardware. Unlike their CMOS counterparts, spintronics-based neurons and synapses can be realized within a compact form factor while operating at an ultra-low energy-delay point.
To explore the benefits of spintronics-based hardware on realistic neuromorphic workloads, we developed a Parallel Discrete-Event Simulation model called Doryta, which is further integrated with a materials-to-systems benchmarking framework. The benchmarking framework allows us to obtain quantitative metrics on the throughput and energy of spintronics-based neuromorphic computing and compare these against standard CMOS-based approaches. Although spintronics hardware offers significant energy and latency advantages, we find that for larger neuromorphic circuits, the performance is evidently limited by the interconnection networks rather than the spintronics-based neurons and synapses. Thus, it becomes imperative to identify interconnect materials that would natively offer low latency and consume less energy than the current copper-based interconnects.
Through Doryta we are also able to show the power of neuromorphic computing by simulating Conway’s Game of Life. We show that Doryta obtains over 400× speedup using 1,280 CPU cores when tested on a convolutional, sparse neural architecture.
The recent literature has reshuffled the architectural organization of speculative parallel discrete event simulation systems for shared-memory multi-core machines. A core aspect has been the full sharing of the workload at the level of individual simulation events, which enables keeping the rollback incidence minimal. However, making each worker thread continuously switch its execution between events destined to different simulation objects does not favor locality. In this article, we propose a workload-sharing algorithm where the worker threads can have short-term binding with specific simulation objects to favor spatial locality and caching effectiveness. Also, new bindings—carried out when a thread decides to switch its execution to other simulation objects—are based on the timeline according to which the object states have passed through the caching hierarchy. At the same time, our solution still enables the worker threads to focus their activities on the events to be processed whose timestamps are closer to the simulation commit horizon—hence we exploit temporal locality along virtual time and keep the rollback incidence minimal. In our design we exploit lock-free constructs to support scalable thread synchronization while accessing the shared event pool. Furthermore, we exploit a multi-view approach of the event pool content, which additionally favors local accesses to the parts of the event pool that are currently relevant for the thread activity. Our solution has been released as an integration within the USE open source speculative simulation platform available to the community. Furthermore, in this article we report the results of an experimental study that shows the effectiveness of our proposal.
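The binding choice described above can be sketched as a ranking over simulation objects; the tuple layout, field names, and tie-breaking order below are invented to illustrate the combination of temporal and spatial locality, not the USE implementation.

```python
def pick_binding(worker, objects, horizon, window):
    """Sketch of a short-term binding choice: among simulation objects
    whose next event lies within a window past the commit horizon
    (temporal locality), prefer one this worker touched last (its state
    is likely still warm in cache), then the most recently accessed one,
    then the lowest timestamp.

    objects: obj_id -> (next_event_ts, last_worker, last_access_seq)"""
    eligible = [o for o, v in objects.items() if v[0] <= horizon + window]
    if not eligible:
        return None
    return min(eligible, key=lambda o: (objects[o][1] != worker,
                                        -objects[o][2],
                                        objects[o][0]))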
Spiking Neural Networks are a class of Artificial Neural Networks that closely mimic biological neural networks. They are particularly interesting because of their potential to advance research in several fields, both through better insights into neural behaviour (benefiting medicine, neuroscience, and psychology) and through their potential in Artificial Intelligence. Their ability to run on a low energy budget once implemented in hardware makes them even more appealing. However, because their behaviour evolves with time, when a hardware implementation is not available their output cannot simply be computed with a one-shot function (however complex); instead, they need to be simulated.
Simulating Spiking Neural Networks is exceptionally costly, mainly due to their sheer size. Many current simulation methods have trouble scaling up on more powerful systems because of conservative synchronisation methods. Scalability is often offered through approximation of the actual results. In this paper, we present a modelling methodology and runtime-environment support adhering to the Time Warp synchronisation protocol, which enables speculative distributed simulation of Spiking Neural Network models with improved accuracy of the results. We discuss the methodological and technical aspects that will allow effective speculative simulation and present an experimental assessment on large virtualised environments, which shows the viability of simulating networks made of millions of neurons.
In the social sciences, simulating opinion dynamics to study how the interplay between homophily and influence leads to the formation of echo chambers is of great importance. Given the wide variety of empirical systems involving the grouping of communities based upon local attribute consensus, simulating and analyzing such dynamics is highly relevant in many other fields as well. In this paper we investigate echo chambers by implementing a unique social game in which we spawn a large number of agents, each assigned one of two opinions on an issue and a finite amount of influence in the form of a game currency. Agents attempt to hold an opinion that is in the majority at the end of the game, to obtain a reward also paid in the game currency. At the beginning of each round, an agent is selected at random, referred to as the speaker. A second agent, the listener, is selected within the speaker’s radius of influence (a fixed subset of the speaker’s neighbors) to interact with the speaker. In this interaction, the speaker proposes a payoff in the game currency from their personal influence budget to persuade the listener to hold the speaker’s opinion in future rounds, until the listener is chosen as a listener again. The listener can accept or reject this payoff. The listener’s choice is informed only by their estimate of the global majority opinion, obtained through a limited view of the opinions of their neighboring agents. We show that the influence game leads to the formation of “echo chambers,” i.e., homogeneous clusters of opinions. We also investigate various scenarios to disrupt the creation of such echo chambers, including the introduction of resource disparity between agents with different opinions, preferential initial assignment of opinions to agents, and the introduction of committed agents, who never change their initial opinion.
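A toy sketch of one round of the described game. The listener's acceptance rule (accept iff the speaker's opinion matches the listener's local majority estimate) is a simplifying assumption; the paper's agents weigh the payoff against their reward expectations.

```python
import random

def play_round(opinions, budgets, neighbors, rng, payoff=1):
    """One round: a random speaker offers a payoff from its budget to a
    random listener in its influence radius; the listener accepts iff its
    local neighborhood suggests the speaker's opinion is the majority."""
    speaker = rng.choice(list(opinions))
    if budgets[speaker] < payoff or not neighbors[speaker]:
        return
    listener = rng.choice(list(neighbors[speaker]))
    # listener's estimate of the global majority from its own neighborhood
    local = [opinions[n] for n in neighbors[listener]]
    est = max(set(local), key=local.count) if local else opinions[listener]
    if opinions[speaker] == est:
        budgets[speaker] -= payoff   # influence is spent...
        budgets[listener] += payoff  # ...and transferred to the listener
        opinions[listener] = opinions[speaker]
```

Transfers are zero-sum, so the total game currency is conserved across rounds; repeated play tends to homogenize opinions within densely connected neighborhoods.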
All military services have invested heavily in creating, deploying, and training with large simulation systems. Historically, these have applied artificial intelligence algorithms to controlling non-player characters (NPCs) or semi-automated forces (SAF) during the execution phase of the events. This has allowed a few human role players to control a much larger collection of virtual/simulated entities or aggregated units. Structured algorithms like finite state machines and knowledge-based systems have typically been limited to this single execution phase of the entire training process. However, “the new AI”, deep learning and machine learning algorithms, operates much differently from the previous generation. These models configure themselves (or learn) from massive amounts of collected data, both labeled and unlabeled. As such, real applications require the collection of massive amounts of historical data. Luckily, military simulation events regularly generate multiple gigabytes of data as a by-product of the training event. Previously, a tiny portion of this was saved and curated to perform after-action review, and the remainder was deleted since it had no practical use and consumed scarce and expensive storage. To leverage deep learning, the military services need to reinvent their policies, relationships, and processes for handling these huge volumes of previously worthless, but now priceless, data. Additionally, deep learning models are much more widely applicable than the previous generation of algorithms. DL models can learn to process huge volumes of data to contribute to the analysis or after-action review stage of an exercise. They can also be used to auto-generate variations on giant scenario databases. They can match exercise plans against exercise results to determine whether training objectives have been met. And they can animate the NPC and SAF units during the execution phase.
We stand at the threshold of applying deep learning to every phase of large military training simulation events.
One of the earliest technical application areas for discrete simulation has been communication systems and networks. This area continues to be a focus of simulation activity as communication systems continue to evolve. Specific uses run the gamut from chip-level networks, through evolving and novel RF waveforms and hardware, and high-performance networks in clusters and data centers, up to Internet-scale protocols and applications. At each level, evolving and new requirements and possible solutions are being developed, often first in simulation, in advance of hardware/software availability. In many cases simulation has fully entered the engineering design cycle, enabling pre-manufacture exploration of performance tradeoffs and edge cases. In this paper we review some of these developments and point to the novel challenges they pose for the simulators themselves.
There has been a buzz around grid modernization for more than a decade. Grid modernization includes, but is not limited to, upgrades to the grid that enhance reliability, resilience, security, and access to clean energy sources. One significant change in this context is the increased penetration of computing power and controlled devices, such as power electronics, in the power grid. As this change happens, the operation and characteristics of the power grid are set to change significantly: the grid is moving from an older, electric-machine-dominated grid that used analog electronics for controls to a power-electronics-dominated grid that uses digital computing for controls. As this transition happens, significant problems of applied mathematics will need to be resolved in power electronics and the power grid, including the ability to simulate at different timescales while leveraging the significantly improved computing capabilities available today. In this paper, the challenges related to simulation of power electronics are discussed, and challenge problems are laid down that applied mathematics may help resolve in the future.
All Badges available in this process are awarded to the paper Spatial/Temporal Locality-based Load-sharing in Speculative Discrete Event Simulation on Multi-core Machines. The artifact is available, functional, and reusable. The claimed results were reproduced using the provided artifact.
The examined paper introduces Doryta, a Parallel Discrete Event Simulation model for ROSS that runs Spiking Neural Networks.
The authors have uploaded their artifact to Zenodo, which ensures long-term retention of the artifact. This paper can thus receive the Artifacts Available badge.
The artifact allows easy re-running of the experiments behind the one figure and part of the table data (in CSV format), and one-line commands are provided to run the remaining experiments, whose data has to be extracted by hand. The dependencies are well documented. The software in the artifact runs correctly with minimal intervention and is relevant to the paper, earning the Artifacts Evaluated—Functional badge. Furthermore, since the artifact uses well-known, state-of-the-art Python libraries to train Neural Network models, which Doryta can read directly, the paper is assigned the Artifacts Evaluated—Reusable badge.
Due to technical limitations, distributed strong scaling experiments could not be reproduced.
The available, functional, and reusable badges are awarded based on the artifact. Some results require specific resources and were, therefore, not included in this report.